







EXISTENCE OF CONSISTENT ESTIMATES OF THE DIRECTIONAL 
PARAMETER IN A LINEAR STRUCTURAL RELATION 
BETWEEN TWO VARIABLES¹


By Jerzy NEYMAN 
University of California, Berkeley, California 


Summary. Let $Z_n$ denote the system of $8n$ independent pairs of measurements $(X_{ik}, Y_{ik})$, for $i = 1, 2, \ldots, n$ and $k = 1, 2, \ldots, 8$, of two nonobservable random variables $\xi_{ik}$ and $\eta_{ik}$, known to satisfy a linear relation of the form $\xi_{ik}\cos\theta^* + \eta_{ik}\sin\theta^* - p = 0$, where $p$ is an arbitrary real number and $\theta^*$ may have any value between the limits

$-\tfrac{1}{2}\pi < \theta^* \le \tfrac{1}{2}\pi.$

The purpose of the paper is to construct a class of estimates $T_n(Z_n)$ of the parameter $\theta$ defined as follows: when $\theta^* = \tfrac{1}{2}\pi$ then $\theta = 0$; otherwise $\theta = \theta^*$. Each estimate $T_n(Z_n)$ of the class considered converges in probability to $\theta$ as $n \to \infty$ under the following conditions: (i) except when $\theta = 0$, the variables $\xi_{ik}$ are nonnormal; (ii) any nonnormal components of the errors of measurement, $X_{ik} - \xi_{ik}$ and $Y_{ik} - \eta_{ik}$, are mutually independent, independent of $\xi_{ik}$, and of the normal components of these errors; (iii) the normal components of the errors may be correlated but as a pair are independent of $\xi_{ik}$.


1. Introduction. Let $\xi$ and $\eta$ be two random variables known to be linearly connected, so that there exist two numbers, $\theta^*$ and $p$,

(1)    $-\tfrac{1}{2}\pi < \theta^* \le \tfrac{1}{2}\pi, \qquad -\infty < p < +\infty,$

such that the simultaneous values of $\xi$ and $\eta$ satisfy the condition

(2)    $\xi\cos\theta^* + \eta\sin\theta^* - p = 0.$


We consider the case where $\xi$ and $\eta$ are not directly observable but where the observations yield the simultaneous values of two other random variables $X$ and $Y$, connected with $\xi$ and $\eta$ by the equations

(3)    $X = \xi + U, \qquad Y = \eta + V.$

Here $U$ and $V$ are unobservable random variables interpreted as errors in measuring $\xi$ and $\eta$, respectively. Equation (2) is described as the linear structural relation between the variables $X$ and $Y$. Throughout the paper it is assumed that the errors $U$ and $V$ may be correlated or not but, as a pair, are independent


1 This paper was prepared with the partial support of the Office of Naval Research. It 
presents an extension of the contents of the Second Rietz Memorial Lecture delivered by 
the author at the Summer Meeting of the Institute of Mathematical Statistics at Boulder, 
Colorado, September 1st, 1949.




of the variables $\xi$ and $\eta$. The problem considered is that of using a sequence $\{X_m, Y_m\}$ of completely independent pairs of observations on $X$ and $Y$ to construct a consistent estimate of $\theta^*$. This is an old problem and a number of the earlier attempts to solve it are described by Wald in an important paper [1].

Early attempts to obtain a consistent estimate of $\theta^*$ were based exclusively on the sample variances and covariance of $X$ and $Y$. However, as early as 1916, Godfrey Thomson showed [2] that the same first and second moments of the simultaneous distribution of $X$ and $Y$ are compatible with an infinity of different values of $\theta^*$ and that, therefore, attempts to estimate this parameter using only second order sample moments are doomed to failure. The writings of Thomson appear to have been overlooked and more and more studies were published using sample moments of the first and second orders as basic functions on which the estimates of $\theta^*$ were built. In 1936 [3] it was pointed out that, should it happen that the unobservable random variables $\xi$ and $\eta$ and also the errors $U$ and $V$ are normally distributed, then no consistent estimate of $\theta^*$ is possible because, in this event, the joint distribution of $X$ and $Y$ is also normal, and is determined by moments of the first two orders. Since these moments are consistent with an infinity of different values of $\theta^*$, the latter is nonidentifiable. Between 1936 and the appearance of the paper by Wald in 1940 several studies were published, of which we will mention one by R. G. D. Allen [4], adding more precision to the facts just described.

Wald's paper brought a new idea into the situation. Namely, in certain cases something may be known about the particular values assumed by the unobservable random variable $\xi$. When this condition obtains, a method due to Wald gives a consistent estimate of $\theta^*$. This estimate is again based on the arithmetic means of the observations on $X$ and $Y$, appropriately grouped. Wald's idea took root and led to the paper by Housner and Brennan [5]. The same idea, a little more developed, is at the base of papers by Berkson [6] and by Hemelrijk [7]. However, important as these developments may be in various fields of application, it is obvious that they do not constitute a solution of the original problem of estimating $\theta^*$ when no knowledge of the particular values assumed by the unobservable random variables is postulated [8].

A new era in the study of the problem began following the result of Reiersøl [9]² who proved that the case of nonidentifiability of $\theta^*$ noted in 1936 is an exception rather than a rule. This discovery stimulated the paper by Scott [10] giving a consistent estimate of $\theta^*$ applicable in a new category of cases, when no information on the particular values of $\xi$ is postulated. However, the consistency of the estimate of Scott depends on the existence of a certain number of moments of the variable $\xi$.

The present paper is concerned with the case where the errors of measurement 
may be split into two components 


2 Although this paper appeared in print in 1950, the author became acquainted with it in
the spring of 1948 from a lecture delivered by Reiersøl in a seminar meeting at the Statistical
Laboratory, University of California, Berkeley.




(4)    $U = U_1 + U_2, \qquad V = V_1 + V_2,$

where $U_1$ and $V_1$ are mutually independent and, as a pair, are independent of $(U_2, V_2)$, and where $U_2$ and $V_2$ follow an arbitrary normal distribution. With the exception of the above independence, no restriction is placed on the distributions of $U_1$ and $V_1$. The purpose of the paper is to give an explicit construction of an estimate of a parameter $\theta$ (closely allied to but not identical with $\theta^*$) which remains consistent in the most general case of identifiability, that is when $\xi$ and $\eta$ follow an arbitrary nonnormal distribution. No knowledge of particular values of $\xi$ is postulated.

Since the above hypotheses admit the possibility that X and Y have no 
moments at all, the conventional methods of constructing the estimate have to 
be abandoned. Essentially, the estimate is defined as the abscissa which corres- 
ponds to the minimum ordinate of a point on a random curve. A search for this 
minimum among the roots of the derivative may be embarrassing. In fact, the 
derivative need not exist at all points. Therefore, the estimate is defined as the 
outcome of a specially devised interpolation procedure. The proof is based on a 
lemma which seems to have an interest of its own and may be applicable in 
other cases. 


2. Concepts of identifiability and of consistent estimability. In order to define the concepts of identifiability³ and of consistent estimability, we shall consider a variable point $\vartheta$ (parameter) capable of assuming any one of a set $s$ of positions $\vartheta'$. Every $\vartheta' \in s$ will be described as a possible value of $\vartheta$. For every $\vartheta' \in s$ consider a specified set $\omega(\vartheta')$ of distribution functions and let $\omega$ stand for the union of all $\omega(\vartheta')$ for $\vartheta' \in s$.

DEFINITION 1. We shall say that the parameter $\vartheta$ is identifiable in $\omega$ if, whatever $\vartheta' \in s$ and whatever $\vartheta'' \in s$, $\vartheta' \ne \vartheta''$, the corresponding sets $\omega(\vartheta')$ and $\omega(\vartheta'')$ have no elements in common.

If $\vartheta$ is identifiable in $\omega$, then to every distribution function $F \in \omega$ there corresponds a uniquely defined value of $\vartheta$, say $\vartheta(F) \in s$.

From now on we shall restrict ourselves to sets $\omega$ of distribution functions $F$ defined in the same Euclidean space of a fixed number $m$ of dimensions. For every $F \in \omega$ we shall consider an $m$-dimensional random variable $X(F)$ whose distribution function is $F$. For $n = 1, 2, \ldots$ the symbol $Y_n(F)$ will denote the set of $n$ completely independent observations made on $X(F)$. Thus, $Y_n(F)$ may be considered as a random variable of dimensionality $mn$. Let $y_n$ denote a point in the $mn$-dimensional Euclidean space $R_{mn}$. Consider a sequence of Borel measurable functions $\{T_n(y_n)\}$, each from $R_{mn}$ to $s$. Obviously, the result $T_n(Y_n(F))$ of substituting $Y_n(F)$ for $y_n$ in $T_n(y_n)$ is a random variable.

DEFINITION 2. If the parameter $\vartheta$ is identifiable in $\omega$ and if, whatever be $F \in \omega$,


3 Important discussion of this concept, in a slightly different form, is due to Koopmans
and Reiersøl [11]. This paper contains a substantial bibliography.







the sequence $\{T_n(Y_n(F))\}$ converges in probability to $\vartheta(F)$ as $n \to \infty$, then this sequence is called a consistent estimate of $\vartheta$ in $\omega$.

DEFINITION 3. If the parameter $\vartheta$ is identifiable in $\omega$ and if there exists a consistent estimate of $\vartheta$ in $\omega$, then we shall say that $\vartheta$ is consistently estimable in $\omega$.


3. Identifiability of the directional parameter in the linear structural relation of two random variables. Returning to the general situation described in Section 1, denote by $\theta$ the parameter defined as follows:

if $-\tfrac{1}{2}\pi < \theta^* < \tfrac{1}{2}\pi$, then $\theta = \theta^*$,
if $\theta^* = \tfrac{1}{2}\pi$, then $\theta = 0$.

The parameter $\theta$ thus defined will be called the directional parameter of the structural relation (2).

Denote by $S$ the set of possible values of $\theta$, $-\tfrac{1}{2}\pi < \theta < \tfrac{1}{2}\pi$. For every value $\vartheta$ of this set we shall now define a set $\Omega(\vartheta)$ of joint distributions of the variables $X$ and $Y$ of formulae (3). We begin by defining $\Omega(0)$.

If $\theta = 0$ then either $\theta^* = 0$ or $\theta^* = \tfrac{1}{2}\pi$. Accordingly, $\Omega(0)$ is defined as the union of the two sets of distributions, $\Omega^*(0)$ and $\Omega^*(\tfrac{1}{2}\pi)$, each corresponding to a particular value of $\theta^*$. If $\theta^* = 0$ then formula (2) implies that $\xi$ is degenerate and $\xi = p$. Assume the following hypotheses:

(a) The variable $\eta$ has an arbitrary distribution.

(b) $U = U_1 + U_2$ and $V = V_1 + V_2$, where $U_1$ and $V_1$ are mutually independent, as a pair are independent of $\xi$ and $\eta$ but otherwise arbitrarily distributed, and where $(U_2, V_2)$ represent a pair of arbitrary normal variables, independent of the triplet $\eta$, $U_1$, $V_1$. In particular, $U_2$ and $V_2$ may be correlated.

(c) $-\infty < p < +\infty$.

Obviously, every specific set of hypotheses regarding $\eta$, $U_1$, $V_1$, $U_2$, $V_2$ and $p$ implies a specific distribution of the pair $X$, $Y$. Then $\Omega^*(0)$ denotes the set of all such distributions.

In order to define $\Omega^*(\tfrac{1}{2}\pi)$ we notice that, if $\theta^* = \tfrac{1}{2}\pi$ then (2) implies that $\eta$ is degenerate and $\eta = p$. $\Omega^*(\tfrac{1}{2}\pi)$ is defined to contain every joint distribution of $X$ and $Y$ implied by an arbitrary assumption regarding the distribution of $\xi$ and by hypotheses (b) and (c).

As mentioned, $\Omega(0)$ is the union of $\Omega^*(0)$ and $\Omega^*(\tfrac{1}{2}\pi)$. However, the reader will verify easily that the sets $\Omega^*(0)$ and $\Omega^*(\tfrac{1}{2}\pi)$ coincide. Therefore $\theta^*$ is not identifiable in $\Omega(0)$.

For every possible value $\vartheta$ of $\theta$, other than $\vartheta = 0$, the set $\Omega(\vartheta)$ is defined to contain every joint distribution of $X$ and $Y$ defined by formulae (3), implied by the assumption that $\xi$ follows an arbitrary nondegenerate, nonnormal distribution, that $\eta$ is connected with $\xi$ by equation (2) with $p$ having an arbitrary real value, and that the errors $U$ and $V$ are arbitrarily distributed, subject to condition (b). It will be seen that the equality $\theta = 0$ characterizes the case where at least one of the variables $\xi$ and $\eta$ is degenerate so that, instead of being linearly connected, these variables may be considered as mutually independent.







Reiersøl proved [9] that the parameter $\theta$ is identifiable in the set $\Omega$ of distributions of $X$ and $Y$ defined as the union of all sets $\Omega(\vartheta)$ for $-\tfrac{1}{2}\pi < \vartheta < \tfrac{1}{2}\pi$. Since it is known that the restriction of nonnormality imposed on $\xi$ and $\eta$ when $\vartheta \ne 0$ cannot be relaxed without destroying the identifiability of $\theta$, it follows that $\Omega$ is the broadest set of joint distributions of $X$ and $Y$ within which $\theta$ is identifiable, consistent with the assumption that the errors of measurement $U$ and $V$ satisfy assumption (b). The purpose of the present paper is to provide an explicit construction of an estimate of $\theta$ consistent in $\Omega$.


4. A few preliminaries. It will be convenient to use the concept of uniform convergence in probability. Let $G(x)$ denote a function defined over a nondegenerate closed interval $x \in [a, b]$. Further, let $\{Z_n\}$ be an infinite sequence of random variables and $\{G_n(Z_n, x)\}$ a sequence of functions of two arguments $Z_n$ and $x$. Each $G_n(Z_n, x)$ is assumed to be defined for every $x \in [a, b]$ and for every possible value of the random variable $Z_n$. Furthermore, when $x$ is fixed, $G_n(Z_n, x)$ is a Borel measurable function of $Z_n$. Thus, it is a random variable.

DEFINITION 4. We shall say that the sequence $\{G_n(Z_n, x)\}$ of random functions converges in probability to $G(x)$ uniformly in $[a, b]$, if there exists a function $m(n)$ defined for all $n = 1, 2, \ldots$ such that

(5)    $\lim_{n \to \infty} m(n) = \infty$

and such that, whatever $\epsilon > 0$,

(6)    $\lim_{n \to \infty} \Bigl( m(n) \sup_{x \in [a, b]} P\{\,|G_n(Z_n, x) - G(x)| > \epsilon\,\} \Bigr) = 0.$

Every function $m(n)$ satisfying the above conditions will be described as the norm of uniform convergence of $\{G_n(Z_n, x)\}$. Obviously, it may always be assumed that the norm $m(n)$ assumes only positive integer values.

In order to illustrate this concept, assume that for every $x \in [a, b]$ and for every $n = 1, 2, \ldots$ we have

(7)    $E[G_n(Z_n, x)] = G(x)$

and that the variance $\sigma_n^2(x)$ of $G_n(Z_n, x)$ is bounded by

(8)    $\sigma_n^2(x) \le \frac{1}{n}\,\sigma_0^2,$

where $\sigma_0 > 0$ is a constant. Using the inequality of Bienaymé-Tchebycheff we may write

(9)    $P\{\,|G_n(Z_n, x) - G(x)| > \epsilon\,\} \le \frac{\sigma_n^2(x)}{\epsilon^2} \le \frac{\sigma_0^2}{n\epsilon^2}$

for every $x \in [a, b]$. Thus

(10)    $\sup_{x \in [a, b]} P\{\,|G_n(Z_n, x) - G(x)| > \epsilon\,\} \le \frac{\sigma_0^2}{n\epsilon^2}$







and it is seen that, under conditions (7) and (8), the sequence $\{G_n(Z_n, x)\}$ converges in probability to $G(x)$ uniformly in $[a, b]$. For example, the norm of uniform convergence may be defined as the greatest integer not exceeding the square root of $n$,

(11)    $m(n) = [\sqrt{n}\,].$
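The role of the norm (11) in condition (6) can be illustrated numerically. The sketch below is only an illustration: the function names and the symbol `sigma0_sq` for the variance constant in (8) are assumptions made here, and the quantity computed is the Tchebycheff upper bound on the expression inside the limit in (6), not the expression itself.

```python
import math


def norm(n):
    """Norm of uniform convergence m(n) = [sqrt(n)] as in (11)."""
    return math.isqrt(n)


def bound_on_condition_6(n, eps, sigma0_sq):
    """Upper bound m(n) * sigma0^2 / (n * eps^2) on the quantity in (6).

    It follows from multiplying a Tchebycheff bound of the form
    sigma0^2 / (n * eps^2) by m(n); sigma0_sq is a hypothetical name
    for the constant bounding n times the variance.
    """
    return norm(n) * sigma0_sq / (n * eps ** 2)


if __name__ == "__main__":
    # The bound decreases roughly like 1 / sqrt(n).
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        print(n, norm(n), bound_on_condition_6(n, eps=0.1, sigma0_sq=1.0))
```

Since $m(n)$ grows like $\sqrt{n}$ while the Tchebycheff bound decreases like $1/n$, the product tends to zero, which is what (6) requires.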


Another convenient concept will be described as the $m$-lattice minimal point of a function $f$. This is defined as follows. Let $[a, b]$ denote a nondegenerate closed interval and $f(x)$ a real function defined on $x \in [a, b]$. Let $m$ be an arbitrary integer $m > 1$ and

(12)    $a_{mk} = a + k\,\frac{b - a}{m - 1} \qquad \text{for } k = 0, 1, \ldots, m - 1.$

We shall say that the $m$ points $a_{mk}$ form the $m$-lattice on $[a, b]$. Now consider the values $f(a_{mk})$ of $f(x)$ corresponding to the points of the $m$-lattice and use the symbol $f_m$ to denote the smallest of these. In general, there will be $r$ points of the lattice, say

(13)    $a_{mk_1} < a_{mk_2} < \cdots < a_{mk_r},$

such that $f(a_{mk_i}) = f_m$. Let $w = [(r + 1)/2]$. The point $a_{mk_w}$ will be described as the $m$-lattice minimal point of the function $f(x)$. It will be denoted by $M_m(f(x))$.
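The definition of the $m$-lattice minimal point translates directly into a short procedure. The following is a minimal sketch, assuming the lattice spacing $(b - a)/(m - 1)$ of (12) and exact floating-point comparison of the lattice values; the function names are chosen here for illustration only.

```python
def lattice(a, b, m):
    """Return the m lattice points a + k*(b - a)/(m - 1), k = 0, ..., m - 1."""
    if m < 2:
        raise ValueError("m must be an integer greater than 1")
    step = (b - a) / (m - 1)
    return [a + k * step for k in range(m)]


def lattice_minimal_point(f, a, b, m):
    """Return the m-lattice minimal point of f on [a, b].

    Among the r lattice points where f attains its smallest lattice
    value, the point with (1-based) rank [(r + 1)/2] is returned.
    """
    points = lattice(a, b, m)
    values = [f(x) for x in points]
    smallest = min(values)
    # Points are generated in increasing order, so this list is sorted.
    minimizers = [x for x, v in zip(points, values) if v == smallest]
    r = len(minimizers)
    w = (r + 1) // 2
    return minimizers[w - 1]


if __name__ == "__main__":
    # A single minimum that happens to fall on the lattice.
    print(lattice_minimal_point(lambda x: (x - 0.5) ** 2, 0.0, 1.0, 5))
    # A constant function: every lattice point is a minimizer.
    print(lattice_minimal_point(lambda x: 0.0, 0.0, 1.0, 5))
```

Taking the middle minimizer rather than the first one is what makes the choice unique when the smallest lattice value is attained at several points.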

FUNDAMENTAL LEMMA. If the real function $G(x)$ is defined and continuous on a nondegenerate closed interval $[a, b]$ in which it has an absolute minimum $G(x_0)$ attained at a single point $x_0$, if $\{Z_n\}$ is a sequence of random variables and if the sequence of real random functions $\{G_n(Z_n, x)\}$ converges to $G(x)$ uniformly in $[a, b]$ with an integer valued norm $m(n)$, then the sequence $\{M_{m(n)}[G_n(Z_n, x)]\}$ of $m(n)$-lattice minimal points of $G_n(Z_n, x)$ converges in probability to $x_0$.

PROOF. Assume that the conditions of the lemma are satisfied. The proof consists in showing that, whatever $\epsilon > 0$ and $\eta > 0$, a number $N(\epsilon, \eta)$ can be found such that the inequality $n > N(\epsilon, \eta)$ implies

(14)    $P\{\,|M_{m(n)}[G_n(Z_n, x)] - x_0| > \epsilon\,\} < \eta.$

Fix $\epsilon$ and $\eta$ and denote by $g$ the minimum value of $G(x)$ attained in the part of $[a, b]$ outside of the open interval $|x - x_0| < \epsilon$. Obviously $g > G(x_0)$. Let $\delta < \epsilon$ be a sufficiently small positive number such that $|x - x_0| < \delta$ implies

(15)    $G(x_0) \le G(x) < G(x_0) + \tfrac{1}{3}\bigl(g - G(x_0)\bigr).$

Denote by $N_1$ the smallest integer such that $n > N_1$ implies

(16)    $\frac{b - a}{m(n) - 1} < \delta$

and by $N_2$ the smallest integer such that $n > N_2$ implies

(17)    $m(n) \sup_{x \in [a, b]} P\{\,|G_n(Z_n, x) - G(x)| > \tfrac{1}{3}\bigl(g - G(x_0)\bigr)\,\} < \eta.$







Finally, let $N(\epsilon, \eta) = \max(N_1, N_2)$. It is easy to see that for $n > N(\epsilon, \eta)$, the inequality (14) is satisfied. We notice first that with $n > N(\epsilon, \eta) \ge N_1$, the interval $(x_0 - \delta, x_0 + \delta)$ will include some points of the $m(n)$-lattice. Further, in order that $|M_{m(n)}[G_n(Z_n, x)] - x_0| > \epsilon$ it is necessary that at least one of the values of $G_n(Z_n, x)$ assumed at points of the $m(n)$-lattice outside of the interval $(x_0 - \epsilon, x_0 + \epsilon)$ not exceed any of the values assumed by this function on the $m(n)$-lattice within $(x_0 - \delta, x_0 + \delta)$. But outside of $(x_0 - \epsilon, x_0 + \epsilon)$ we have

(18)    $G(x_0) < g \le G(x)$

and inside of $(x_0 - \delta, x_0 + \delta)$

(19)    $G(x) < G(x_0) + \tfrac{1}{3}\bigl(g - G(x_0)\bigr).$

It follows that, if at each point of the $m(n)$-lattice the random function $G_n(Z_n, a_{m(n)k})$ differs from $G(a_{m(n)k})$ by at most $\tfrac{1}{3}(g - G(x_0))$, then

$|M_{m(n)}[G_n(Z_n, x)] - x_0| \le \epsilon.$

Thus, the probability that $|M_{m(n)}[G_n(Z_n, x)] - x_0| > \epsilon$ is at most equal to the probability, say $\pi$, that for at least one point $a_{m(n)k}$ of the $m(n)$-lattice $|G_n(Z_n, a_{m(n)k}) - G(a_{m(n)k})| > \tfrac{1}{3}(g - G(x_0))$. However,

(20)    $\pi \le \sum_{k=0}^{m(n)-1} P\{\,|G_n(Z_n, a_{m(n)k}) - G(a_{m(n)k})| > \tfrac{1}{3}(g - G(x_0))\,\} \le m(n) \sup_{x \in [a, b]} P\{\,|G_n(Z_n, x) - G(x)| > \tfrac{1}{3}(g - G(x_0))\,\} < \eta$

because of (17), and the proof of the lemma is completed.


5. Consistent estimates of the directional parameter of a linear structural relation between two variables. We return to the problem of the consistent estimation of the directional parameter $\theta$ of the structural relation (2). The parameter $\theta$ was defined in Section 3. Also it will be assumed that the joint distribution function $F$ of the variables $X$ and $Y$ belongs to the set $\Omega$ defined in Section 3. Consider a set of $N(n) = 8n$ independent observations to be made on the pair of variables $X$ and $Y$. These observations will be divided into $n$ eight-tuples and denoted by $(X_{ij}, Y_{ij})$ for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, 8$. The $i$th eight-tuple will be denoted by $Z'_i$. The totality of $n$ eight-tuples will be denoted by $Z_n$.

In defining the estimate of $\theta$ we shall need three (identical or different) probability density functions $w_1(x)$, $w_2(x)$, and $w_3(x)$, and their characteristic functions, say $\Phi_1(t)$, $\Phi_2(t)$, and $\Phi_3(t)$, respectively. These probability density functions can be selected arbitrarily out of a class $\Gamma$ which we shall define by the following conditions: every $w(x) \in \Gamma$ is symmetric about zero, $w(-x) = w(x)$, and there exists a positive number $a$ such that $w(x) > 0$ for every $|x| < a$.







It will be observed that the symmetry of $w_k(x)$ implies that the corresponding characteristic function $\Phi_k(t)$ is real.

Speaking in terms of the characteristic functions $\Phi_1$, $\Phi_2$, $\Phi_3$, we shall define a class $C$ of consistent estimates of $\theta$. Any particular choice of the functions $\Phi_1$, $\Phi_2$, and $\Phi_3$ will determine a particular estimate of the class $C$. For example, we may choose to consider the following probability densities of class $\Gamma$: (1) the normal probability density with zero mean and unit variance, (2) the Cauchy probability density with unit scale and zero location parameter, and (3) the rectangular probability density between $-a$ and $+a$. Each of the corresponding characteristic functions, $\exp\{-\tfrac{1}{2}t^2\}$, $\exp\{-|t|\}$, and $\sin at / at$, respectively, may be taken to represent either $\Phi_1$ or $\Phi_2$ or $\Phi_3$, or any two, or all three $\Phi_1 = \Phi_2 = \Phi_3$.
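The three characteristic functions just listed are simple enough to write down directly. The sketch below is illustrative only; the function names are chosen here, and the value at $t = 0$ of the rectangular case is supplied by continuity.

```python
import math


def cf_normal(t):
    """Characteristic function exp(-t^2 / 2) of the standard normal density."""
    return math.exp(-0.5 * t * t)


def cf_cauchy(t):
    """Characteristic function exp(-|t|) of the standard Cauchy density."""
    return math.exp(-abs(t))


def cf_rectangular(t, a=1.0):
    """Characteristic function sin(a t) / (a t) of the rectangular density on (-a, a)."""
    x = a * t
    if x == 0.0:
        return 1.0  # limiting value at the origin
    return math.sin(x) / x


if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        print(t, cf_normal(t), cf_cauchy(t), cf_rectangular(t))
```

Each function is real and even in $t$, in agreement with the symmetry of the underlying density, and none exceeds unity in absolute value.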

Assume that the choice of the functions $\Phi_k(t)$ is made. Denote by $\vartheta$ an arbitrary number between the limits $-\tfrac{1}{2}\pi \le \vartheta \le +\tfrac{1}{2}\pi$. For the $k$th eight-tuple of observations define the following symbols

(21)
$A(Z'_k, \vartheta) = \Phi_1\bigl((X_{k1} - X_{k2} + X_{k3} - X_{k4})\cos\vartheta + (Y_{k1} - Y_{k2} + Y_{k3} - Y_{k4})\sin\vartheta\bigr)\,\Phi_2(X_{k1} - X_{k2} + X_{k5} - X_{k6}),$
$B(Z'_k) = \Phi_3(Y_{k1} - Y_{k2} + Y_{k7} - Y_{k8}),$
$C(Z'_k) = \Phi_3(Y_{k1} - Y_{k4} - Y_{k6} + Y_{k7}),$
$D(Z'_k) = \Phi_3(Y_{k3} - Y_{k4} + Y_{k5} - Y_{k6});$

(22)    $H(Z'_k, \vartheta) = A(Z'_k, \vartheta)\{B(Z'_k) - 2C(Z'_k) + D(Z'_k)\}.$

Finally, let

(23)    $G_n(Z_n, \vartheta) = \frac{1}{n} \sum_{k=1}^{n} H(Z'_k, \vartheta).$

Put $m(n) = [\sqrt{n}\,]$ and consider the $m(n)$-lattice on the closed interval $[-\tfrac{1}{2}\pi, +\tfrac{1}{2}\pi]$. For every fixed value of $Z_n$ we consider $G_n(Z_n, \vartheta)$ as a function of $\vartheta \in [-\tfrac{1}{2}\pi, +\tfrac{1}{2}\pi]$ and then $M_{m(n)}[G_n(Z_n, \vartheta)]$ will denote its $m(n)$-lattice minimal point. After these preliminaries we define the estimate $T_n(Z_n)$ of $\theta$ as follows.

(24)
(i) If $G_n(Z_n, 0) \le \frac{1}{\sqrt[4]{n}}$, then $T_n(Z_n) = 0$.
(ii) If $G_n(Z_n, 0) > \frac{1}{\sqrt[4]{n}}$, then $T_n(Z_n) = M_{m(n)}[G_n(Z_n, \vartheta)]$.
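The construction (21) to (24) can be sketched in a few lines. The code below is a hedged illustration rather than a definitive implementation: it assumes the eight-tuple index pattern of (21) as reconstructed above, takes the standard normal characteristic function for all three of $\Phi_1$, $\Phi_2$, $\Phi_3$ by default, and uses function and argument names that are hypothetical. It requires $n \ge 4$ so that $m(n) > 1$.

```python
import math


def cf_normal(t):
    """exp(-t^2 / 2), one admissible choice for Phi_1, Phi_2, Phi_3."""
    return math.exp(-0.5 * t * t)


def H(tuple8, v, phi1=cf_normal, phi2=cf_normal, phi3=cf_normal):
    """H of (22) for one eight-tuple [(x1, y1), ..., (x8, y8)] and direction v."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6), (_, y7), (_, y8) = tuple8
    a = phi1((x1 - x2 + x3 - x4) * math.cos(v) + (y1 - y2 + y3 - y4) * math.sin(v))
    a *= phi2(x1 - x2 + x5 - x6)
    b = phi3(y1 - y2 + y7 - y8)
    c = phi3(y1 - y4 - y6 + y7)
    d = phi3(y3 - y4 + y5 - y6)
    return a * (b - 2.0 * c + d)


def G_n(tuples, v, **cfs):
    """Average (23) of H over the n eight-tuples."""
    return sum(H(z, v, **cfs) for z in tuples) / len(tuples)


def lattice_minimal_point(f, a, b, m):
    """Middle minimizer of f over the m-lattice on [a, b]."""
    step = (b - a) / (m - 1)
    points = [a + k * step for k in range(m)]
    values = [f(x) for x in points]
    smallest = min(values)
    minimizers = [x for x, val in zip(points, values) if val == smallest]
    return minimizers[(len(minimizers) + 1) // 2 - 1]


def estimate(tuples, **cfs):
    """Estimate T_n of (24) from a list of n eight-tuples of (x, y) pairs."""
    n = len(tuples)
    if n < 4:
        raise ValueError("need n >= 4 so that m(n) = [sqrt(n)] exceeds 1")
    if G_n(tuples, 0.0, **cfs) <= n ** -0.25:
        return 0.0  # case (i)
    m = math.isqrt(n)
    half_pi = 0.5 * math.pi
    return lattice_minimal_point(lambda v: G_n(tuples, v, **cfs), -half_pi, half_pi, m)


if __name__ == "__main__":
    zeros = [[(0.0, 0.0)] * 8 for _ in range(4)]
    print(estimate(zeros))  # G_n vanishes identically here, so case (i) applies
```

Because every factor in (21) is a characteristic function, $|H| \le 4$ regardless of the data, so the procedure is well defined even when $X$ and $Y$ have no moments. The sketch says nothing about how large $n$ must be in practice before the estimate becomes useful.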
THEOREM. The sequence $\{T_n(Z_n)\}$ represents an estimate of $\theta$ consistent in $\Omega$.

PROOF. We begin by noticing that, since the symbols in (21) are defined in terms of characteristic functions, their absolute values cannot exceed unity. Therefore,

(25)    $|H(Z'_k, \vartheta)| \le 4,$







and thus all moments of $H(Z'_k, \vartheta)$ exist. In particular, we shall be interested in the first moment, say

(26)    $E[H(Z'_k, \vartheta)] = E\{G_n(Z_n, \vartheta)\} = G(\vartheta),$

and in the variance, say $\sigma^2(\vartheta)$, of $H(Z'_k, \vartheta)$. Obviously, $\sigma^2(\vartheta) \le 16$. Since the successive variables $H(Z'_k, \vartheta)$ are completely independent, the variance of $G_n(Z_n, \vartheta)$, say $\sigma_n^2(\vartheta)$, is

(27)    $\sigma_n^2(\vartheta) = \frac{\sigma^2(\vartheta)}{n} \le \frac{16}{n}$

and it follows that the sequence $\{G_n(Z_n, \vartheta)\}$ converges in probability to $G(\vartheta)$ uniformly in $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$. As we have seen before (see Section 4) the function $m(n) = [\sqrt{n}\,]$ may be taken as the norm of the uniform convergence.

Our next step in the proof consists in showing that the function $G(\vartheta)$ has the following properties.

(A) If the random variables $X$ and $Y$ follow a distribution $F \in \Omega$ such that $\theta(F) = 0$, then $G(\vartheta) = 0$ for all $\vartheta \in [-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ including $\vartheta = 0$.

(B) If $\theta(F) \ne 0$, then $G(\vartheta) > 0$ for all $\vartheta \in [-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ with the exception of $\vartheta = \theta(F)$ where $G[\theta(F)] = 0$.

(C) $G(\vartheta)$ is continuous for $\vartheta \in [-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$.

Once these three properties of $G(\vartheta)$ are established, the proof of the theorem is completed as follows. Assume first that $\theta(F) = 0$. Then, by the theorem of Bienaymé-Tchebycheff,

(28)    $P\Bigl\{G_n(Z_n, 0) \le \frac{1}{\sqrt[4]{n}}\Bigr\} \ge P\Bigl\{|G_n(Z_n, 0)| \le \frac{1}{\sqrt[4]{n}}\Bigr\} \ge 1 - \sigma_n^2(0)\sqrt{n} \ge 1 - \frac{16}{\sqrt{n}}.$

The definition of $T_n(Z_n)$ implies that it is equal to zero whenever

(29)    $G_n(Z_n, 0) \le \frac{1}{\sqrt[4]{n}}$, unconditionally,

and also whenever

(30)    $G_n(Z_n, 0) > \frac{1}{\sqrt[4]{n}}$ and $M_{m(n)}[G_n(Z_n, \vartheta)] = 0.$

Consequently, the probability

(31)    $P\{T_n(Z_n) = 0\} \ge P\Bigl\{G_n(Z_n, 0) \le \frac{1}{\sqrt[4]{n}}\Bigr\} \ge 1 - \frac{16}{\sqrt{n}}$

and tends to unity as $n \to \infty$.
Assume now that $\theta(F) \ne 0$. According to the fundamental lemma, in this case $M_{m(n)}[G_n(Z_n, \vartheta)]$ converges in probability to $\theta(F)$. To prove that the same







is true for $T_n(Z_n)$ it is sufficient to show that the probability $P\{T_n(Z_n) \ne M_{m(n)}[G_n(Z_n, \vartheta)]\}$ tends to zero as $n \to \infty$. Obviously this last probability does not exceed the probability that $G_n(Z_n, 0) \le n^{-1/4}$. According to property (B) we have $G(0) > 0$ in the case considered. When $n > G(0)^{-4}$, we have

(32)    $P\Bigl\{G_n(Z_n, 0) \le \frac{1}{\sqrt[4]{n}}\Bigr\} \le P\Bigl\{|G_n(Z_n, 0) - G(0)| \ge G(0) - \frac{1}{\sqrt[4]{n}}\Bigr\} \le \frac{16}{n\bigl(G(0) - \frac{1}{\sqrt[4]{n}}\bigr)^2}$

and it follows that, as $n \to \infty$, the probability that $T_n(Z_n)$ will coincide with $M_{m(n)}[G_n(Z_n, \vartheta)]$ tends to unity. It is seen that the properties (A), (B), and (C) of the function $G(\vartheta)$ combined with (26) imply that, whatever $F \in \Omega$, the estimate $\{T_n(Z_n)\}$ converges in probability to $\theta(F)$ or, in other words, that $T_n(Z_n)$ is an estimate of $\theta$ consistent in $\Omega$. Therefore, in order to prove the theorem, we shall establish that the expectation (26) has the properties (A), (B), and (C). This will be done in Section 6 in the following order. First we shall use the postulated properties of the observable random variables $X$ and $Y$ and define a function $G(\vartheta)$ having the properties (A), (B), and (C). Next we shall show that the function $G(\vartheta)$ so defined coincides with the expectation (26).



6. Structural definition of $G(\vartheta)$. The structural definition of $G(\vartheta)$ is based on the properties of the characteristic function, say $\phi(t_1, t_2)$, of the joint distribution of $X$ and $Y$. According to the usual definition

(33)    $\phi(t_1, t_2) = E\bigl(e^{i(t_1 X + t_2 Y)}\bigr),$

where

(34)    $X = \xi + U_1 + U_2, \qquad Y = \eta + V_1 + V_2.$

Assume first that $\theta = 0$. In this case the components $\xi + U_1$ and $\eta + V_1$ are mutually independent and the possible dependence of $X$ and $Y$ will be due to the correlation that may exist between the normal components of errors $U_2$ and $V_2$. Since the logarithm of the characteristic function of two normal variables is a polynomial of the second order, when $\theta = 0$ the characteristic function of $X$ and $Y$ has the form, say

(35)    $\phi(t_1, t_2 \mid \theta = 0) = e^{\psi_1(t_1) + \psi_2(t_2) + \gamma t_1 t_2},$

where $\psi_i(t_i)$ is a function of $t_i$ alone, $i = 1, 2$. We note this form of $\phi(t_1, t_2 \mid \theta = 0)$ for future reference and proceed to the next case, where $\theta \ne 0$. In this case $\theta = \theta^*$ and the structural relation (2) may be solved with respect to $\eta$:

(36)    $\eta = \frac{p}{\sin\theta} - \xi\cot\theta.$
sin 7 







Substituting this expression into (34) and denoting the logarithm of the characteristic function of $\xi$ by $\chi(t)$, we have

(37)    $\phi(t_1, t_2) = e^{\psi_1(t_1) + \psi_2(t_2) + \gamma t_1 t_2 + \chi(t_1 - t_2\cot\theta)},$

where the symbols $\psi_1$ and $\psi_2$ are again used to denote functions of one argument only, either $t_1$ or $t_2$. These functions in (37) have a meaning different from that in (35). However, this difference is of no importance because in both cases the essential point is that $\psi_1$ depends on $t_1$ but not on $t_2$ and that $\psi_2$ depends on $t_2$ but not on $t_1$. It will be convenient to consider that $\phi(t_1, t_2)$ always has the form (37) with the understanding that, when $\theta = 0$, then $\chi(t) \equiv 0$.

Since $\psi_1(t)$, $\psi_2(t)$ and $\chi(t)$ are defined in terms of logarithms of characteristic functions, they vanish at $t = 0$ and are continuous at this point. In addition, we shall use the following important property of $\chi(t)$. This is that, whenever $\theta(F) \ne 0$, then however small $\delta > 0$, the function $\chi(t)$ cannot coincide with a polynomial of second order on the whole of the interval $(-\delta, \delta)$. This property is implied by the hypothesis that, whenever $\theta \ne 0$ and therefore $\chi(t) \not\equiv 0$, then $\xi$ is not normally distributed. In fact, assume that there exists a positive number $\delta^*$ such that $\chi(t) = a + bt + ct^2$ for all $|t| < \delta^*$. It is easy to see that in this case all the derivatives of the characteristic function of $\xi$ would exist at $t = 0$ and would determine all the moments of $\xi$. Furthermore, these moments would coincide with the moments of a normal distribution, from which it would follow that $\xi$ itself is normally distributed, contrary to the hypothesis. Thus it follows that, if $\chi(t)$ coincides with a quadratic in $t$ over an interval, this interval cannot include $t = 0$.


Select a number $\vartheta \in [-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ and three arbitrary real numbers $t$, $\tau_1$, $\tau_2$. We shall consider $\phi(t_1, t_2)$ at the following eight points which, to abbreviate the formulae, will be denoted by lower case Roman numerals. Thus, for example, $\phi(\mathrm{i})$ will denote the value of $\phi(t_1, t_2)$ evaluated at the first of the eight points. The coordinates of the first four points are

(i)    $t_1 = t\cos\vartheta + \tau_1, \qquad t_2 = t\sin\vartheta + \tau_2,$
(ii)   $t_1 = t\cos\vartheta + \tau_1, \qquad t_2 = t\sin\vartheta,$
(iii)  $t_1 = t\cos\vartheta, \qquad t_2 = t\sin\vartheta + \tau_2,$
(iv)   $t_1 = t\cos\vartheta, \qquad t_2 = t\sin\vartheta.$

The coordinates of points (v) through (viii) are obtained from those of (i) to (iv), respectively, by substituting $t = 0$. Thus

(v)    $t_1 = \tau_1, \qquad t_2 = \tau_2,$
(vi)   $t_1 = \tau_1, \qquad t_2 = 0,$
(vii)  $t_1 = 0, \qquad t_2 = \tau_2,$
(viii) $t_1 = 0, \qquad t_2 = 0.$







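Obviously $\phi(\mathrm{viii}) = 1$. Now we form the function

(38)    $h(\vartheta, t, \tau_1, \tau_2) = \phi(\mathrm{i})\phi(\mathrm{iv})\phi(\mathrm{vi})\phi(\mathrm{vii}) - \phi(\mathrm{ii})\phi(\mathrm{iii})\phi(\mathrm{v})\phi(\mathrm{viii}).$

Easy algebra gives

(39)    $h(\vartheta, t, \tau_1, \tau_2) = \Psi_1\Psi_2 - \Psi_1\Psi_3,$

where

(40)
$\Psi_1 = \exp\{\psi_1(t\cos\vartheta + \tau_1) + \psi_1(t\cos\vartheta) + \psi_1(\tau_1) + \psi_2(t\sin\vartheta + \tau_2) + \psi_2(t\sin\vartheta) + \psi_2(\tau_2) + \gamma[(t\cos\vartheta + \tau_1)(t\sin\vartheta + \tau_2) + t^2\cos\vartheta\sin\vartheta]\},$
$\Psi_2 = \exp\{\chi(At + \tau_1 - \tau_2\cot\theta) + \chi(At) + \chi(\tau_1) + \chi(-\tau_2\cot\theta)\},$
$\Psi_3 = \exp\{\chi(At + \tau_1) + \chi(0) + \chi(At - \tau_2\cot\theta) + \chi(\tau_1 - \tau_2\cot\theta)\},$

with

(41)    $A = \frac{\sin(\theta - \vartheta)}{\sin\theta}.$

For any $x > 0$ we shall use the symbol $\sigma(x)$ to denote the set of triplets $(t, \tau_1, \tau_2)$ such that $|t| < x$, $|\tau_1| < x$ and $|\tau_2| < x$. Because of the properties of the functions $\psi_1$, $\psi_2$, and $\chi$ there exists a positive number $\delta$ such that within $\sigma(\delta)$ the functions $\Psi_1$ and $\Psi_3$ do not vanish. Consequently, for $(t, \tau_1, \tau_2) \in \sigma(\delta)$ we may write

(42)    $h(\vartheta, t, \tau_1, \tau_2) = \Psi_1\Psi_3\Bigl(\frac{\Psi_2}{\Psi_3} - 1\Bigr) = \Psi_1\Psi_3\bigl(\exp\{[\chi(At + \tau_1 - \tau_2\cot\theta) - \chi(At + \tau_1) - \chi(At - \tau_2\cot\theta) + \chi(At)] - [\chi(\tau_1 - \tau_2\cot\theta) - \chi(\tau_1) - \chi(-\tau_2\cot\theta) + \chi(0)]\} - 1\bigr).$

The idea of the function $h(\vartheta, t, \tau_1, \tau_2)$ originated from the paper of Reiersøl and this function is the key to the whole construction of the estimate $T_n(Z_n)$. The function $h(\vartheta, t, \tau_1, \tau_2)$ is defined as a combination of values of the characteristic function of the observable random variables $X$ and $Y$ at eight arbitrarily selected points. Consequently, the definition of $h(\vartheta, t, \tau_1, \tau_2)$ is independent of the value of $\theta(F)$. However, the properties of $h(\vartheta, t, \tau_1, \tau_2)$ do depend on $\theta(F)$, as follows.

(a) If $\theta(F) = 0$, then $h(\vartheta, t, \tau_1, \tau_2) = 0$ for all values of the four arguments $\vartheta \in [-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ and $-\infty < t, \tau_1, \tau_2 < +\infty$.

(b) If $\theta(F) \ne 0$ and $\vartheta = \theta(F)$, then $h(\vartheta, t, \tau_1, \tau_2) = 0$ for all combinations of values of $t$, $\tau_1$, $\tau_2$, $-\infty < t, \tau_1, \tau_2 < +\infty$.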




(c) If $\theta(F) \ne 0$ and $\vartheta \ne \theta(F)$ then, whatever $\delta_1 > 0$, the cube $\sigma(\delta_1)$ contains a subset of points $(t, \tau_1, \tau_2)$ of positive three-dimensional measure within which $h(\vartheta, t, \tau_1, \tau_2) \ne 0$.
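Properties (a) to (c) can be checked numerically on concrete characteristic functions. The sketch below evaluates $h$ of (38) directly from the eight points. The two joint characteristic functions are hypothetical examples constructed here, not taken from the paper: both use normal errors with unit variances and covariance $0.3$; the first has $\xi$ degenerate at zero and $\eta$ standard Cauchy (so $\theta = 0$), the second has $\xi$ following a Laplace law with characteristic function $1/(1 + t^2)$, $p = 0$ and $\theta = 0.8$.

```python
import math


def h(phi, v, t, tau1, tau2):
    """h of (38): a combination of the joint characteristic function phi
    at the eight points (i) to (viii) determined by v, t, tau1, tau2."""
    c, s = math.cos(v), math.sin(v)
    pts = {
        "i": (t * c + tau1, t * s + tau2),
        "ii": (t * c + tau1, t * s),
        "iii": (t * c, t * s + tau2),
        "iv": (t * c, t * s),
        "v": (tau1, tau2),
        "vi": (tau1, 0.0),
        "vii": (0.0, tau2),
        "viii": (0.0, 0.0),
    }
    f = {name: phi(t1, t2) for name, (t1, t2) in pts.items()}
    return f["i"] * f["iv"] * f["vi"] * f["vii"] - f["ii"] * f["iii"] * f["v"] * f["viii"]


def normal_errors(t1, t2):
    """Joint cf of correlated normal errors (unit variances, covariance 0.3)."""
    return math.exp(-0.5 * (t1 * t1 + t2 * t2) - 0.3 * t1 * t2)


def phi_theta_zero(t1, t2):
    """Example with theta = 0: xi degenerate at 0, eta standard Cauchy."""
    return math.exp(-abs(t2)) * normal_errors(t1, t2)


THETA = 0.8  # hypothetical true direction for the second example


def phi_theta(t1, t2):
    """Example with theta = THETA: Laplace xi, eta = -xi * cot(THETA)."""
    u = t1 - t2 / math.tan(THETA)
    return normal_errors(t1, t2) / (1.0 + u * u)


if __name__ == "__main__":
    print(h(phi_theta_zero, 0.4, 0.7, -0.3, 0.5))  # (a): zero up to rounding
    print(h(phi_theta, THETA, 1.0, 0.5, -0.5))     # (b): zero up to rounding
    print(h(phi_theta, 0.2, 1.0, 0.5, -0.5))       # (c): clearly nonzero
```

In these examples $h$ happens to be real; in general it is complex valued, which is why the construction later passes to the squared modulus.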

In order to prove (a) we notice that the case $\theta(F) = 0$ is characterized by the identity $\chi(t) \equiv 0$. Making this substitution in (42) it is immediately seen that, in this particular case, $h(\vartheta, t, \tau_1, \tau_2) = 0$ for all combinations of values of the four arguments.

In order to prove (b) we notice that $\vartheta = \theta(F)$ implies

(43)    $A = \frac{\sin(\theta - \vartheta)}{\sin\theta} = 0.$

Then (42) implies that $h(\theta, t, \tau_1, \tau_2) = 0$ for all combinations of values of the three arguments $t$, $\tau_1$, $\tau_2$.

In proving (c) we shall use the hypothesis that $\xi$ is not a normal variable and that, therefore, however small $\delta > 0$, the function $\chi(t)$ cannot coincide with a polynomial of second order on the whole of the interval $|t| < \delta$. Assume that the assertion (c) is not true and that, with $\vartheta \ne \theta(F) \ne 0$, there exists a positive number $\delta^*$ such that, for $(t, \tau_1, \tau_2) \in \sigma(\delta^*)$ we have identically $h(\vartheta, t, \tau_1, \tau_2) = 0$. Then this identity will also hold for all sufficiently small $|t|$ and $|\tau_1|$ and

(44)    $\tau_2 = -\tau_1\tan\theta.$

Within the common part of $\sigma(\delta)$ and $\sigma(\delta^*)$ the functions $\Psi_1$ and $\Psi_3$ do not vanish. Therefore, we must conclude that the result of substituting (44) into $\Psi_2$ and $\Psi_3$ must give $\Psi_2/\Psi_3 = 1$ for all sufficiently small $|t|$ and $|\tau_1|$. This, however, implies that

(45)    $\chi(At + 2\tau_1) - 2\chi(At + \tau_1) + \chi(At) = \chi(2\tau_1) - 2\chi(\tau_1) + \chi(0).$

It will be seen that the expressions on both sides of this identity represent second differences of $\chi(t)$ at steps $\tau_1$ evaluated at points $At$ and zero, respectively. Thus, the assumption $h(\vartheta, t, \tau_1, \tau_2) \equiv 0$ in $(t, \tau_1, \tau_2) \in \sigma(\delta^*)$ leads to the conclusion that there must exist a certain vicinity $W$ of the point $t = 0$ where, however small $|\tau_1|$, the second difference of the function $\chi(t)$ computed at steps $\tau_1$ has a value possibly depending on $\tau_1$ but not on the point at which it is evaluated. Since $\chi(t)$ is continuous, it must then coincide with a polynomial of second order in $t$ over the whole interval $W$. This, however, is contrary to the hypothesis. Therefore, if $\vartheta \ne \theta(F) \ne 0$, whatever be $\delta > 0$ the cube $\sigma(\delta)$ must contain at least one point $(t', \tau_1', \tau_2')$ such that $h(\vartheta, t', \tau_1', \tau_2') \ne 0$. Since $h$ is continuous in $(t, \tau_1, \tau_2)$ it then follows that $\sigma(\delta)$ must contain a set of three-dimensional positive measure where $h(\vartheta, t, \tau_1, \tau_2) \ne 0$. This establishes (c).
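The second-difference argument behind (45) can be made concrete. The sketch below, with names chosen here for illustration, compares the second difference of a quadratic $\chi$ (the logarithm of the standard normal characteristic function) with that of a nonquadratic $\chi$ (the logarithm of the Laplace characteristic function $1/(1 + t^2)$, a hypothetical example of a nonnormal $\xi$). For the quadratic the value depends on the step only; for the Laplace case it changes with the point of evaluation, so an identity of the form (45) fails.

```python
import math


def second_difference(f, x, step):
    """Second difference f(x + 2*step) - 2*f(x + step) + f(x)."""
    return f(x + 2.0 * step) - 2.0 * f(x + step) + f(x)


def chi_normal(t):
    """Logarithm of the standard normal cf: a polynomial of second order."""
    return -0.5 * t * t


def chi_laplace(t):
    """Logarithm of the Laplace cf 1 / (1 + t^2): not a quadratic."""
    return -math.log(1.0 + t * t)


if __name__ == "__main__":
    step = 0.1
    for x in (0.0, 0.3, 0.7):
        print(x, second_difference(chi_normal, x, step),
              second_difference(chi_laplace, x, step))
```

This is the numerical counterpart of the step in the proof where constancy of the second difference forces $\chi$ to be a quadratic near the origin.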

When $h(\vartheta, t, \tau_1, \tau_2) \ne 0$, it may be represented by a real or by a complex number. It is known that by changing the signs of the arguments of any characteristic function one obtains a value which is conjugate to the original value of this characteristic function. It is easy to see that the same applies to $h(\vartheta, t, \tau_1, \tau_2)$. Therefore, the product

(46)    $h(\vartheta, t, \tau_1, \tau_2)\,h(\vartheta, -t, -\tau_1, -\tau_2) = |h(\vartheta, t, \tau_1, \tau_2)|^2 = g(\vartheta, t, \tau_1, \tau_2),$







say, is equal to the square of the modulus of $h(\vartheta, t, \tau_1, \tau_2)$. It follows from the
preceding that the function $g(\vartheta, t, \tau_1, \tau_2)$ is real valued, nonnegative and continuous
in $(t, \tau_1, \tau_2)$. Also, it is easy to see that $g(\vartheta, t, \tau_1, \tau_2)$ cannot be greater
than 4. Furthermore, if $\theta(F) = 0$ then $g$ is identically zero. Also, it is zero identically
in $t, \tau_1, \tau_2$ if $\theta(F) \neq 0$ but $\vartheta = \theta(F)$.
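The conjugation property used to obtain (46) can be checked numerically. The following is only a sketch: it uses an empirical characteristic function of a hypothetical sample (the helper `empirical_cf` and the sample itself are illustrative assumptions, not the function $h$ of the paper) to show that reversing the signs of the arguments gives the complex conjugate, so that the product is real and nonnegative.

```python
import cmath
import random


def empirical_cf(pairs, t1, t2):
    """Empirical characteristic function of (X, Y) evaluated at (t1, t2)."""
    return sum(cmath.exp(1j * (t1 * x + t2 * y)) for x, y in pairs) / len(pairs)


random.seed(0)
# Hypothetical sample of (X, Y) pairs; any joint distribution would serve here.
pairs = [(random.gauss(0.0, 1.0) + random.random(), random.expovariate(1.0))
         for _ in range(500)]

t1, t2 = 0.7, -0.3
phi_plus = empirical_cf(pairs, t1, t2)
phi_minus = empirical_cf(pairs, -t1, -t2)

# Changing the signs of both arguments yields the conjugate value,
# so the product equals the squared modulus.
product = phi_plus * phi_minus
```

The same argument applies to any finite product or difference of such values, which is why the product in (46) is a squared modulus.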

On the other hand, if $\theta(F) \neq 0$ and $\vartheta \neq \theta(F)$, then in every vicinity of $t = \tau_1 = \tau_2 = 0$
there is a set of positive three-dimensional measure where
$g(\vartheta, t, \tau_1, \tau_2) > 0$. Now, let $w_1(x)$, $w_2(x)$ and $w_3(x)$ be three (identical or different)
probability density functions of class T (that is, each symmetric about $x = 0$
and nonvanishing in a nondegenerate interval $|x| < a$). Also, let


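(47)   $G(\vartheta) = \int w_1(t) \int w_2(\tau_1) \int w_3(\tau_2)\, g(\vartheta, t, \tau_1, \tau_2)\, dt\, d\tau_1\, d\tau_2.$

It is obvious that, whatever the chosen probability density functions $w_1$, $w_2$,
and $w_3$,

if $\theta(F) = 0$, then $G(\vartheta) = 0$ for every $\vartheta \in [-\tfrac12\pi, \tfrac12\pi]$,

if $\theta(F) \neq 0$ and $\vartheta = \theta(F)$, then $G(\vartheta) = 0$,

if $\theta(F) \neq 0$ and $\vartheta \neq \theta(F)$, then $G(\vartheta) > 0$.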


Also, because of the definition of $g(\vartheta, t, \tau_1, \tau_2)$ in terms of the characteristic
function of X and Y, $G(\vartheta)$ is a continuous function of $\vartheta$. It follows that the
function G defined in formula (47) possesses the properties (A), (B), and (C)
mentioned at an earlier stage of the proof of the theorem (Section 5). In order
to complete this proof, we now show that, if $\Phi_k(t)$ denotes the characteristic
function of $w_k(x)$, k = 1, 2, 3, then the expectation of $H(Z_k, \vartheta)$, defined by
(22) and (21), is equal to $G(\vartheta)$ of formula (47), identically in $\vartheta \in [-\tfrac12\pi, \tfrac12\pi]$.

For this purpose we return to the function $g(\vartheta, t, \tau_1, \tau_2)$ and reexamine its
definition (46) in terms of $h(\vartheta, t, \tau_1, \tau_2)$ and ultimately in terms of the characteristic
function $\phi(t_1, t_2)$ as in (38). It is seen that $g(\vartheta, t, \tau_1, \tau_2)$ and $G(\vartheta)$ may
be written conveniently as linear combinations of four terms each, say

(48)   $G(\vartheta) = G_1(\vartheta) - G_2(\vartheta) - G_3(\vartheta) + G_4(\vartheta),$
(49)   $g(\vartheta, t, \tau_1, \tau_2) = g_1 - g_2 - g_3 + g_4,$


where, for k = 1, 2, 3, 4,


(50)   $G_k(\vartheta) = \iiint g_k\, w_1(t)\, w_2(\tau_1)\, w_3(\tau_2)\, dt\, d\tau_1\, d\tau_2,$


and $g_k$ stands for the product of from six to eight factors, each factor representing
the characteristic function of X and Y evaluated at specified values of the two
arguments. Upon inspecting (46) and (38) the reader will have no difficulty in
writing down the expressions of the four components $g_k$. To save space we shall







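reproduce only the expression of $g_1$ represented by the product of eight factors,
as follows:


(51)   $g_1 = \phi(t\cos\vartheta + \tau_1,\; t\sin\vartheta + \tau_2)\,\phi(-t\cos\vartheta - \tau_1,\; -t\sin\vartheta - \tau_2)$
       $\cdot\,\phi(t\cos\vartheta,\; t\sin\vartheta)\,\phi(-t\cos\vartheta,\; -t\sin\vartheta)$
       $\cdot\,\phi(\tau_1, 0)\,\phi(-\tau_1, 0)\,\phi(0, \tau_2)\,\phi(0, -\tau_2).$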


Consider the kth eight-tuple of independent observations on X and Y and let
$(X_{kj}, Y_{kj})$ represent the jth pair of this eight-tuple. Obviously, we may write


       $\phi(t\cos\vartheta + \tau_1,\; t\sin\vartheta + \tau_2) = E[\exp\{it(X_{k1}\cos\vartheta + Y_{k1}\sin\vartheta) + i\tau_1 X_{k1} + i\tau_2 Y_{k1}\}],$
(53)
       $\phi(-t\cos\vartheta - \tau_1,\; -t\sin\vartheta - \tau_2) = E[\exp\{-it(X_{k2}\cos\vartheta + Y_{k2}\sin\vartheta) - i\tau_1 X_{k2} - i\tau_2 Y_{k2}\}],$


etc. Because of the complete independence of all the eight pairs $(X_{kj}, Y_{kj})$, the
expression of $g_1$ may be written as the expectation of a single exponential,


(54)   $g_1 = E[\exp\{it((X_{k1} - X_{k2} + X_{k3} - X_{k4})\cos\vartheta + (Y_{k1} - Y_{k2} + Y_{k3} - Y_{k4})\sin\vartheta)$
       $\qquad + i\tau_1(X_{k1} - X_{k2} + X_{k5} - X_{k6}) + i\tau_2(Y_{k1} - Y_{k2} + Y_{k7} - Y_{k8})\}].$


This expectation is just a convenient symbol for an eightfold Stieltjes integral
with respect to the distribution function $F(x_j, y_j)$ of each pair $(X_{kj}, Y_{kj})$.
Thus the component $G_1(\vartheta)$ of $G(\vartheta)$ is an elevenfold integral. Since this integral
is absolutely convergent, we may invert the order of integration and write


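       $G_1(\vartheta) = E\Big( \int_{-\infty}^{+\infty} e^{it[(X_1 - X_2 + X_3 - X_4)\cos\vartheta + (Y_1 - Y_2 + Y_3 - Y_4)\sin\vartheta]}\, w_1(t)\, dt$
       $\qquad \cdot \int_{-\infty}^{+\infty} e^{i\tau_1(X_1 - X_2 + X_5 - X_6)}\, w_2(\tau_1)\, d\tau_1 \cdot \int_{-\infty}^{+\infty} e^{i\tau_2(Y_1 - Y_2 + Y_7 - Y_8)}\, w_3(\tau_2)\, d\tau_2 \Big),$


or, remembering the definition of $\Phi_1$, $\Phi_2$, and $\Phi_3$,

(56)   $G_1(\vartheta) = E\{\Phi_1[(X_1 - X_2 + X_3 - X_4)\cos\vartheta + (Y_1 - Y_2 + Y_3 - Y_4)\sin\vartheta]$
       $\qquad \cdot\, \Phi_2(X_1 - X_2 + X_5 - X_6)\, \Phi_3(Y_1 - Y_2 + Y_7 - Y_8)\},$

or, finally


(57)   $G_1(\vartheta) = E[A(Z_k, \vartheta)\, B(Z_k)],$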







with the symbols $A(Z_k, \vartheta)$ and $B(Z_k)$ defined for every eight-tuple of completely
independent observations as in formulae (21). Similarly it is easy to
show that


(58)   $G_2(\vartheta) = G_3(\vartheta) = E[A(Z_k, \vartheta)\, C(Z_k)],$
       $G_4(\vartheta) = E[A(Z_k, \vartheta)\, D(Z_k)].$


This, however, implies that

(59)   $G(\vartheta) = E[H(Z_k, \vartheta)],$


and the proof of the theorem is completed. 


7. Acknowledgment. The results presented in this paper differ in several 
respects from the contents of the Second Rietz Memorial Lecture of 1949. 
Among other things it was possible to remove a certain restrictiveness of the 
original estimate of $\theta$. The parameter considered in 1949 was not $\theta$ itself but
rather $\beta = \cot\theta$. In order to construct the original estimate of $\beta$, it was necessary
to use a number B known to exceed $|\beta|$. It is a pleasure to acknowledge
the author's indebtedness to Professor Charles M. Stein for a useful suggestion
which led to the present construction of the estimate of $\theta$, independent of any
advance knowledge of the value of this parameter.


REFERENCES 
[1] Abraham Wald, "The fitting of straight lines if both variables are subject to error," Annals of Math. Stat., Vol. 11 (1940), pp. 284-300.
[2] Godfrey H. Thomson, "A hierarchy without a general factor," British Jour. Psych., Vol. 8 (1916), pp. 271-281.
[3] J. Neyman, "Remarks on a paper by E. C. Rhodes," Jour. Roy. Stat. Soc., Vol. 100 (1937), pp. 50-57.
[4] R. G. D. Allen, "The assumptions of linear regression," Economica, N. S., Vol. 6 (1939), pp. 191-204.
[5] W. G. Housner and J. F. Brennan, "The estimation of linear trends," Annals of Math. Stat., Vol. 19 (1948), pp. 380-388.
[6] Joseph Berkson, "Are there two regressions?" Jour. Am. Stat. Assn., Vol. 45 (1950), pp. 164-180.
[7] J. Hemelrijk, "Construction of a confidence region for a line," Nederl. Akad. Wetensch., Proc., Vol. 52 (1949), pp. 995-1005.
[8] J. Neyman and Elizabeth L. Scott, "On certain methods of estimating the linear structural relation," Annals of Math. Stat., Vol. 22 (1951), pp. 352-361.
[9] Olav Reiersøl, "Identifiability of a linear relation between variables which are subject to error," Econometrica, Vol. 18 (1950), pp. 375-389.
[10] Elizabeth L. Scott, "Note on consistent estimates of the linear structural relation between two variables," Annals of Math. Stat., Vol. 21 (1950), pp. 284-288.
[11] T. C. Koopmans and O. Reiersøl, "The identification of structural characteristics," Annals of Math. Stat., Vol. 21 (1950), pp. 165-181.





TEST CRITERIA FOR HYPOTHESES OF SYMMETRY OF A 
REGRESSION MATRIX¹


By Uttam Chand²
University of North Carolina and Boston University 


Summary. Hotelling’s [1] theoretical findings in mathematical economics on 
the rational behavior of buyers in maximizing their net profit indicate that the 
matrix of the first partial derivatives of a set of related demand functions would 
be symmetric and negative definite. It is the object of this paper to determine 
whether the assumption of symmetry will be tenable in the light of the particular 
set of observations. The study of test functions for the property of definiteness 
as a whole will form the subject of a forthcoming paper. The present investi- 
gation assumes that the demand functions are regression functions and, there- 
fore, results obtained in this paper do not cover all types of demand functions. 
The test function U proposed here for the hypothesis of symmetry is invariant 
under all contragredient transformations. The distribution of U depends on 
unknown nuisance parameters. The likelihood ratio under the hypothesis of
symmetry leads to a multilateral matric equation which represents $\tfrac12 p(p + 1)$
equations of the third degree in $\tfrac12 p(p + 1)$ unknown regression coefficients for
the p-variate case. It has not been possible to establish the existence of a nontrivial
solution of this equation, and it is, therefore, not being given here.


1. Introduction. Let $p_i$ denote the price of the ith commodity and $q_i$ the
quantity consumed at that price. Consider $p_i = f_i(q_1, q_2, \cdots)$ a set of demand
functions and let $u = u(q_1, q_2, \cdots)$ represent the gross receipts of a purchaser
of goods. Under the assumption that each entrepreneur tries to maximize his
net profit $\pi = u - \sum p_i q_i$, Hotelling [1] in an important contribution concerning
the theoretical nature of supply and demand functions showed that if the
entrepreneur is working in a steady economic state in which there is no restriction
on his money expenditure, then the matrix of the first partial derivatives
of prices on quantities would be symmetric, that is,

       $\frac{\partial p_i}{\partial q_j} = \frac{\partial p_j}{\partial q_i},$

and that for a true maximum such a matrix would be negative definite, that is,

       $\frac{\partial p_i}{\partial q_i} < 0, \qquad \frac{\partial(p_i, p_j)}{\partial(q_i, q_j)} > 0, \qquad \frac{\partial(p_i, p_j, p_k)}{\partial(q_i, q_j, q_k)} < 0, \quad \cdots.$


¹ This paper was presented at the Cleveland meeting of the Institute on December 27,
1948.

² The author wishes to express his grateful appreciation to Professors Harold Hotelling
and William G. Madow for guidance in this research.




It would thus appear that the inequalities arising out of the negative definiteness 
of the matrix generalize the conditions that a demand curve shall decline. 

No suitable statistical tests have existed for testing the hypothesis of sym- 
metry and negative definiteness of the matrix referred to in the previous par- 
agraph. Henry Schultz [2] was first to consider such a question and the present 
paper has grown out of his statistical attempts. To verify Hotelling’s laws on the 
basis of a particular set of data consider 


(1.1)   $p_i = f_i(q_1, q_2, \cdots) + u_i,$


a system of demand equations where $p_i$ and $q_i$ denote current prices and quantities
and where $u_i$ is a stochastic variable. We shall assume that the quantities are
fixed and prices are determined by demand. For example, some government
agency could conduct actual experiments fixing alternative sets of quantities
and observing what prices the choice of buyers would lead to. In such a situation
we shall, therefore, be justified in assuming demand functions to be regression
functions. In general the quantities are determined by a certain type of supply
function under the prevailing market mechanism. Suppose the supply functions
are given by


(1.2)   $p_i = g_i(q_1, q_2, \cdots) + v_i,$


where $v_i$ is a stochastic variable and u's and v's have a more or less specified joint
probability law. If (1.1) and (1.2) are to hold simultaneously their solutions, if
they exist, will be the only observable values of prices and quantities; and, therefore,
quantities such as $\partial q_i/\partial p_j$ cannot in general be estimated and consequently
no question of testing symmetry could be raised. However we could conceive of a 
different type of supply functions from those in (1.2) containing other inde- 
pendently determined variables besides the p’s and being of such a stochastic 
type that the equations (1.1) would be regression equations [12]. For the purpose 
of this investigation we shall assume that the demand equations are regression 
equations such that the mathematical expectation of p; is equal to f; and since 
not all demand functions are regression equations, the results of the present 
investigation are not applicable to all types of demand equations. 

Since we are studying certain properties of correlated variables any proposed 
statistical criterion must satisfy the property of invariance under linear trans- 
formations of prices and quantities. The fact of the transformation of quantities 
being not independent of that of prices will further restrict us to the consider- 
ation of such relations as are invariant under a linear transformation of one set 
of variates contragredient to those of the other ([3], pp. 108-109). The importance
of such a class of relations was first suggested by Hotelling in a series of papers 
[4], [5], [6]. Examples of such a “value preserving” class of transformations may 
be found in the mixing of different grades of wheat or the combination of raw 
materials and labor into finished products such that the total value remains 
unchanged. 


The statistic U (Section 4) proposed here for the hypothesis of symmetry for 







the case of two related commodities is invariant under all contragredient trans- 
formations. It is exact in the sense that its probability distribution law is pre- 
cisely determined under the hypothesis. Certain practically useful relations between
this statistic and Student's $t$ will be indicated. This test has in addition
the property of being an unbiased test in the sense of Neyman and Pearson. We 
consider its p-variate generalization in Section 4.4. 


2. Probability model. Let $Y = \|y_{i\alpha}\|$ be a $p \times N$ sample matrix from a
normal multivariate parent having $\sigma = \|\sigma_{ij}\|$ as the dispersion matrix and
$\eta = \|\eta_{i\alpha}\| = \beta X$ as the corresponding matrix of expectations where $\beta = \|\beta_{ij}\|$
is the population regression matrix and $X = \|x_{i\alpha}\|$ is the matrix of nonrandom
observations on the fixed variates (e.g., $y_1, \cdots, y_p$ may denote prices and
$x_1, \cdots, x_p$ the quantities consumed at these prices). Let $g = \|g_{ij}\| = XY'$,
where $g_{ij} = \sum_\alpha x_{i\alpha} y_{j\alpha}$ and where $Y'$ is the transpose of $Y$. Set $a = XX'$ and
$c = \|c_{ij}\| = a^{-1}$ $(i, j = 1, \cdots, p;\ \alpha = 1, \cdots, N;\ p < N)$. We shall assume
without loss of generality that $y_{i\alpha}$'s and $x_{i\alpha}$'s are either measured from their
means or from polynomial means if the y's are subject to a time trend. It will
be noticed that the symmetry and definiteness of the matrix of partial derivatives
of prices on quantities is equivalent to the symmetry and definiteness of
the regression matrix $\beta$.


3. Contragredient transformation of the two sets of variates. Let $f = \|f_{ij}\|$ be
a $p \times p$ nonsingular matrix and let the columns $x$ of the matrix $X$ be subjected
to the transformation $f$; we write $x^* = fx$. If the columns $y$ are transformed into
columns $y^*$ in such a way that $y^{*\prime}x^* = y'x$ for every $x$ and $y$, then the transformation
of the y's is uniquely determined, viz., $y^* = f'^{-1}y$. We say, under these
circumstances, that the columns $x$ on the one hand, and the columns $y$ on the
other hand, are transformed contragrediently under $f$. For the mathematical
expectation of $y^*$'s we have $E(y^*) = \beta^* x^*$ where $\beta^* = f'^{-1}\beta f^{-1}$. Consequently
$\beta^{*\prime} = \beta^*$ implies $\beta' = \beta$ and conversely. Thus we notice that the symmetry of
the matrix $\beta$ is preserved by this type of transformation. Since the property of
definiteness is invariant under any nonsingular linear transformation, the
hypotheses of symmetry and definiteness are invariant and we might as well
consider the properties of the matrix $\beta^*$. If we denote by $\sigma^* = \|\sigma^*_{ij}\|$
the covariance matrix of the $y^*$'s, we have $\sigma^* = f'^{-1}\sigma f^{-1}$ and consequently the
ratio of the determinants $|\beta|$ and $|\sigma|$ is an absolute invariant. We now state
the following theorem:

THEOREM I. If $\sigma$ is a positive definite matrix and $\beta$ a real symmetric matrix and
the two are cogrediently transformed, there exists a nonsingular linear transformation
which will reduce $\sigma$ to an identity matrix and $\beta$ to a diagonal matrix.

PROOF. We have $\beta = f'\beta^* f$ and $\sigma = f'\sigma^* f$ and the proof follows from a theorem
given in [3] (p. 171). We shall make use of this result in Sections 4 and 5.
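The invariance statements of this section can be checked on a small numerical case. The sketch below is illustrative only: the 2 × 2 matrices `f`, `beta`, `sigma` and the vectors `x`, `y` are arbitrary hypothetical values, and the helper functions are written out by hand to keep the example self-contained.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def apply(a, v):
    """Matrix times column vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


# Hypothetical nonsingular f, symmetric beta, positive definite sigma.
f = [[2.0, 1.0], [0.5, 3.0]]
beta = [[-2.0, 0.7], [0.7, -1.5]]
sigma = [[1.0, 0.3], [0.3, 2.0]]

f_inv = inverse(f)
f_inv_t = transpose(f_inv)

# Contragredient pair: x* = f x and y* = f'^{-1} y.
x, y = [1.0, -2.0], [0.5, 4.0]
x_star, y_star = apply(f, x), apply(f_inv_t, y)

# beta* = f'^{-1} beta f^{-1} and sigma* = f'^{-1} sigma f^{-1}.
beta_star = matmul(matmul(f_inv_t, beta), f_inv)
sigma_star = matmul(matmul(f_inv_t, sigma), f_inv)

inner_before = y[0] * x[0] + y[1] * x[1]
inner_after = y_star[0] * x_star[0] + y_star[1] * x_star[1]
ratio_before = det(beta) / det(sigma)
ratio_after = det(beta_star) / det(sigma_star)
```

In this example the inner product $y'x$, the symmetry of $\beta$, and the ratio $|\beta|/|\sigma|$ are all unchanged by the transformation, as the text asserts.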


4. Hypothesis of symmetry of the regression matrix $\beta$.
4.1. The statistic U. We shall show that for the bivariate case the statistic U
now to be presently defined provides an exact and unbiased test for $H_0: \beta_{12} = \beta_{21}$
against the set of alternatives which do not specify anything except $\beta_{12} \neq \beta_{21}$.
Consider


       $U = (b_{12} - b_{21})^2 (c_{22}s_{11} + c_{11}s_{22} - 2c_{12}s_{12})^{-1},$


where

(i) The sample regression coefficients $b_{ij}$ are normally distributed with means
$\beta_{ij}$ and $E(b_{kj} - \beta_{kj})(b_{mi} - \beta_{mi}) = \sigma_{km}c_{ij}$.

(ii) The $s_{ij}$'s are the unbiased estimates, each based on (say) n degrees of
freedom, of $\sigma_{ij}$ and follow the Wishart [7] law. Actually we have

       $n s_{ij} = \sum_{\alpha=1}^{N} (y_{i\alpha} - Y_{i\alpha})(y_{j\alpha} - Y_{j\alpha}),$

where $Y_{i\alpha}$'s are sample regression functions.

(iii) The $c_{ij}$'s have been previously defined (Section 2).

Under the assumption of the conditional bivariate normal law for the $y_{i\alpha}$'s
(Section 2), the residuals of $y_{i\alpha}$'s from their respective sample regression functions
are normally distributed. If the $y_{i\alpha}$'s are subject to a time trend, as may
very often be the case in economics, it will be more appropriate to consider the
model


       $E(y_{i\alpha}) = \alpha_0 + \alpha_1\xi_1(t) + \alpha_2\xi_2(t) + \cdots + \beta_{i1}(x_{1\alpha} - \bar{x}_1) + \beta_{i2}(x_{2\alpha} - \bar{x}_2),$


where the $\xi(t)$'s are known polynomials in time. Under such a model also residuals
are known to be normally distributed. Consequently we might as well have
assumed such a model which will thus only affect the number of degrees of
freedom available for the estimates $s_{ij}$.

THEOREM II. If x and y are transformed contragrediently, the statistic U is an
absolute invariant.

PROOF. Set $s = \|s_{ij}\|$ and $b = \|b_{ij}\|$. Under the contragredient transformation
of x and y (Section 3) we have $b = f'b^*f$; $s = f's^*f$; and $c = f'c^*f$. If we
perform this transformation on U and simplify, we notice that the numerator
and denominator of U are relative invariants of weight −2 and consequently
U is an absolute invariant.
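Theorem II can be illustrated numerically for the bivariate case. The following is a sketch under stated assumptions: the matrices `b`, `s`, `c` and `f` are arbitrary hypothetical values (not data from the paper), and `statistic_u` is simply the displayed formula for U written as a function.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inverse(a):
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def statistic_u(b, s, c):
    """U = (b12 - b21)^2 / (c22 s11 + c11 s22 - 2 c12 s12)."""
    numerator = (b[0][1] - b[1][0]) ** 2
    denominator = c[1][1] * s[0][0] + c[0][0] * s[1][1] - 2.0 * c[0][1] * s[0][1]
    return numerator / denominator


def contragredient(m, f):
    """Return m* = f'^{-1} m f^{-1}, so that m = f' m* f."""
    f_inv = inverse(f)
    return matmul(matmul(transpose(f_inv), m), f_inv)


# Hypothetical sample quantities for the bivariate case.
b = [[-1.2, 0.9], [0.4, -0.8]]   # sample regression matrix (not symmetric)
s = [[1.5, 0.2], [0.2, 0.9]]     # estimated dispersion matrix
c = [[0.4, 0.1], [0.1, 0.3]]     # c = (XX')^{-1}
f = [[2.0, 1.0], [0.5, 3.0]]     # nonsingular transformation

u_original = statistic_u(b, s, c)
u_transformed = statistic_u(contragredient(b, f), contragredient(s, f),
                            contragredient(c, f))
```

Numerator and denominator each pick up the factor $|f|^{-2}$ under the transformation, so their ratio is unchanged.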


4.2. Distribution of U under the null hypothesis. Since U is an absolute invariant
under the contragredient transformation of x and y we may derive the distribution
of U taking $\sigma$ to be an identity matrix and $\beta$ to be a diagonal matrix (Theorem I)
in the parametric space.

The numerator and denominator of U are distributed independently of one
another [8]. Let $Z = c_{22}s_{11} + c_{11}s_{22} - 2c_{12}s_{12}$. This is a positive definite quadratic
form in normally distributed variates. Let $u_\alpha$ and $v_\alpha$ represent residuals of $y_1$ and
$y_2$ from the corresponding sample regression functions. There exists an orthogonal
transformation of the N variables $u_\alpha$ and $v_\alpha$ which will simultaneously yield
$s_{11} = \sum_1^n u_\alpha^{*2}/n$; $s_{22} = \sum_1^n v_\alpha^{*2}/n$ and $s_{12} = \sum_1^n u_\alpha^* v_\alpha^*/n$, where $u^*$ and $v^*$ are normally
and independently distributed with 0 means and a common variance for each
set $u^*$, $v^*$. Consider now the orthogonal transformation


       $u_\alpha' = u_\alpha^*\cos\theta - v_\alpha^*\sin\theta,$
       $v_\alpha' = u_\alpha^*\sin\theta + v_\alpha^*\cos\theta,$

where $\theta$ is so determined that $nZ = d_1\sum_1^n u_\alpha'^2 + d_2\sum_1^n v_\alpha'^2$; then

       $d_1 = \tfrac12\{c_{11} + c_{22} + [(c_{11} - c_{22})^2 + 4c_{12}^2]^{1/2}\},$
       $d_2 = \tfrac12\{c_{11} + c_{22} - [(c_{11} - c_{22})^2 + 4c_{12}^2]^{1/2}\}.$


Consequently $U = \chi^2_{1,1}(C_1\chi^2_{2,n} + C_2\chi^2_{3,n})^{-1}$ where the $\chi^2$'s have independent
$\chi^2$-distributions with degrees of freedom as indicated in the second subscripts and

       $C_1 = (2n)^{-1}\{1 + [1 - 4|c|/(c_{11} + c_{22})^2]^{1/2}\},$

       $C_2 = (2n)^{-1}\{1 - [1 - 4|c|/(c_{11} + c_{22})^2]^{1/2}\},$

and $|c| = c_{11}c_{22} - c_{12}^2$.
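As a quick check of these constants, the sketch below computes $C_1$ and $C_2$ from a 2 × 2 matrix c and compares them with $d_1$ and $d_2$ divided by $n(c_{11} + c_{22})$. The function name and the numerical matrix are hypothetical, chosen only for illustration; the quantity `w` is the ratio $|c|/(c_{11} + c_{22})^2$.

```python
import math


def mixture_constants(c, n):
    """Return (w, C1, C2) for a symmetric 2 x 2 matrix c and n degrees of freedom."""
    det_c = c[0][0] * c[1][1] - c[0][1] ** 2
    trace_c = c[0][0] + c[1][1]
    w = det_c / trace_c ** 2
    # Guard against tiny negative values caused by rounding when w is 1/4.
    root = math.sqrt(max(0.0, 1.0 - 4.0 * w))
    return w, (1.0 + root) / (2.0 * n), (1.0 - root) / (2.0 * n)


# Hypothetical positive definite c and degrees of freedom.
c = [[0.4, 0.1], [0.1, 0.3]]
n = 10
w, c1, c2 = mixture_constants(c, n)

# The same constants obtained from the characteristic roots d1, d2.
spread = math.sqrt((c[0][0] - c[1][1]) ** 2 + 4.0 * c[0][1] ** 2)
d1 = 0.5 * (c[0][0] + c[1][1] + spread)
d2 = 0.5 * (c[0][0] + c[1][1] - spread)
trace_c = c[0][0] + c[1][1]
```

For any positive definite c the two routes agree, and $C_1 + C_2 = 1/n$.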

The distribution of a quantity similar to U was first obtained by Hsu [9]. An independent
derivation of the distribution of the quantity $\tau = \xi(\lambda_1\chi_1^2 + \lambda_2\chi_2^2)^{-1/2}$,
where $\lambda_1$, $\lambda_2$ are certain positive constants and $\xi$ is N(0, 1), will also be found
in [10]. Robbins and Pitman [11] have obtained general results for the distribution
of the ratio of mixtures of $\chi^2$'s, of which the form (4.2.1) given below is a
particular case.

We have the following two forms for the frequency function of U:


(4.2.1)   $g_0(U) = [B(n, \tfrac12)U^{1/2}]^{-1}(C_2/C_1)^{n/2}\, C_2^{1/2}(1 + C_2U)^{-n-1/2}\, F\!\left(n + \tfrac12, \tfrac12 n, n, \frac{1 - C_2/C_1}{1 + C_2U}\right)$


for any value of n, and

(4.2.2)   $g(U) = U^{-1/2}\sum_{h=0}^{n/2-1} \{\Gamma(\tfrac12 n)\Gamma(h + 1)B(\tfrac12 n - h, \tfrac12)\}^{-1}\,\Gamma(\tfrac12 n + h)(C_1 - C_2)^{-n/2-h}$
          $\qquad \cdot \{(-1)^{n/2}C_1^h C_2^{(n+1)/2}(1 + C_2U)^{-(n+1)/2+h} + (-1)^h C_1^{(n+1)/2}C_2^h(1 + C_1U)^{-(n+1)/2+h}\}$


for n even [10].

We notice that since $C_1 + C_2 = 1/n$, the distribution of U essentially depends
on $C_1$ or $C_2$ and is, therefore, precisely determined by n and the quantity $|c|/(\mathrm{tr}\, c)^2$
$(= |a|/(\mathrm{tr}\, a)^2) = w$ (say). If $w > \tfrac14$, $C_1$ and $C_2$ are both imaginary. When the
matrix c is a 2 × 2 matrix, the truth of the relation $0 \le w \le \tfrac14$ can also be
verified independently. The relation (4.2.1) is not defined when $C_2 = 0$, i.e.,
when w = 0; however it is clear from the form of U that it is distributed as
Student's $t$ with n degrees of freedom. If $w = \tfrac14$, $C_1 = C_2 = 1/(2n)$, and U has
the $t$ distribution with 2n degrees of freedom. We shall refer to this again in the
next section where we examine the overall behavior of the probability of Type
I error of U with respect to w.


4.3. Probability of Type I error of U. To derive $P = P(U > U_0)$ corresponding
to the form of the frequency function (4.2.1) we put $\zeta = (1 + C_2U)^{-1}$ and
after integration obtain


(4.3.1)   $P = (C_2/C_1)^{n/2}\sum_{h=0}^{\infty} \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}(1 - C_2/C_1)^h\, I_{\zeta_0}(n + h, \tfrac12),$


where $I_{\zeta_0}(p, q)$ is the incomplete beta ratio and $\zeta_0 = (1 + C_2U_0)^{-1}$. The series
(4.3.1) consists of positive terms and is absolutely and uniformly convergent.
Corresponding to the form (4.2.2) for even degrees of freedom we similarly
obtain


(4.3.2)   $P = \sum_{h=0}^{n/2-1} \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}(C_1 - C_2)^{-n/2-h}$
          $\qquad \cdot [(-1)^{n/2}C_1^h C_2^{n/2}\, I_{\zeta_0}(\tfrac12 n - h, \tfrac12) + (-1)^h C_1^{n/2}C_2^h\, I_{\zeta_0'}(\tfrac12 n - h, \tfrac12)],$

where $\zeta_0' = (1 + C_1U_0)^{-1}$.

Consider the series (4.3.1). Following Robbins and Pitman [11] if we set

       $d_h = (C_2/C_1)^{n/2}\,\Gamma(\tfrac12 n + h)(1 - C_2/C_1)^h\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1},$

so that $\sum d_h = 1$, we have

       $0 < P - \sum_{h=0}^{p} d_h I_{\zeta_0}(n + h, \tfrac12) < \left(1 - \sum_{h=0}^{p} d_h\right) I_{\zeta_0}(n + 2(p + 1), \tfrac12).$


For any given $U_0$ this inequality sets an upper bound to the error committed in
P in stopping at the (p + 1)st term of the series (4.3.1) which has been found
to be slowly convergent. Whenever n is even and not large, the finite form
(4.3.2) is to be preferred for computational purposes.
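The two forms of P can be compared numerically. The sketch below is an illustration under assumptions, not the author's computing procedure: it evaluates the series (4.3.1) by direct summation, using a hand-written Simpson rule for the incomplete beta ratio $I_x(a, \tfrac12)$ after the substitution $u = (1 - t)^{1/2}$, and compares it for n = 2 with the single-term finite form, in which $I_x(1, \tfrac12) = 1 - (1 - x)^{1/2}$. The function names, the number of terms and the step count are arbitrary choices.

```python
import math


def inc_beta_ratio_half(x, a, steps=400):
    """I_x(a, 1/2) for a >= 1, by Simpson's rule after substituting u = sqrt(1 - t)."""
    lower = math.sqrt(1.0 - x)
    width = (1.0 - lower) / steps

    def integrand(u):
        return (1.0 - u * u) ** (a - 1.0)

    total = integrand(lower) + integrand(1.0)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(lower + i * width)
    integral = 2.0 * total * width / 3.0
    log_beta = math.lgamma(a) + math.lgamma(0.5) - math.lgamma(a + 0.5)
    return integral / math.exp(log_beta)


def constants(n, w):
    root = math.sqrt(1.0 - 4.0 * w)
    return (1.0 + root) / (2.0 * n), (1.0 - root) / (2.0 * n)


def type_one_error_series(u0, n, w, terms=200):
    """Partial sum of the infinite series for P(U > u0), valid for 0 < w < 1/4."""
    c1, c2 = constants(n, w)
    rho = c2 / c1
    zeta0 = 1.0 / (1.0 + c2 * u0)
    total = 0.0
    for h in range(terms):
        log_d = (0.5 * n * math.log(rho) + math.lgamma(0.5 * n + h)
                 - math.lgamma(0.5 * n) - math.lgamma(h + 1.0)
                 + h * math.log(1.0 - rho))
        total += math.exp(log_d) * inc_beta_ratio_half(zeta0, n + h)
    return total


def type_one_error_n2(u0, w):
    """Finite form for n = 2 (only the h = 0 term is present)."""
    c1, c2 = constants(2, w)

    def tail(c):
        return 1.0 - math.sqrt(1.0 - 1.0 / (1.0 + c * u0))

    return (c1 * tail(c1) - c2 * tail(c2)) / (c1 - c2)


series_value = type_one_error_series(3.0, 2, 0.2)
finite_value = type_one_error_n2(3.0, 0.2)
```

For the values tried here the two forms agree closely, and the finite form also decreases as w increases, in line with the monotonicity discussed next.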

We now state the following theorem concerning the dependence of the proba- 
bility of Type I error of U on the variable parameter w: 

THEOREM III. For any n and fixed $U_0$, $P(U > U_0 \mid H_0)$ is a monotone decreasing
function of the variable parameter w.

PROOF. We shall prove this result by considering the derivative of P with
respect to w. From (4.3.1) we obtain

       $\frac{dP}{dw} = (C_2/C_1)^{n/2-1}(1 - 4w)^{-1/2}\sum_{h=0}^{\infty} \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}$
       $\qquad \cdot \Big[4\{1 + (1 - 4w)^{1/2}\}^{-2}\, I_{\zeta_0}(n + h, \tfrac12)\{\tfrac12 n(1 - C_2/C_1)^h - h(C_2/C_1)(1 - C_2/C_1)^{h-1}\}$
       $\qquad\quad - (nC_1)^{-1}(1 - C_2/C_1)^h\, \frac{\zeta_0^{n+h}(1 - \zeta_0)^{1/2}}{B(n + h, \tfrac12)}\Big],$

which may actually be shown to represent a derivative. Following Hsu ([9],
pp. 14-15) the series


       $\sum \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}\{\tfrac12 n(1 - C_2/C_1)^h - h(C_2/C_1)(1 - C_2/C_1)^{h-1}\}\, I_{\zeta_0}(n + h, \tfrac12)$


can be shown to be equivalent to

       $\sum \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}(\tfrac12 n + h)(1 - C_2/C_1)^h\, m,$


where

       $m = (n + h)^{-1}\zeta_0^{n+h}(1 - \zeta_0)^{1/2}/B(n + h, \tfrac12) = I_{\zeta_0}(n + h, \tfrac12) - I_{\zeta_0}(n + h + 1, \tfrac12).$

After some simplification we obtain


(4.3.3)   $\frac{dP}{dw} = (1 - 4w)^{-1/2}(C_2/C_1)^{n/2-1}(nC_1)^{-2}\sum_{h=0}^{\infty} \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}(1 - C_2/C_1)^h$
          $\qquad \cdot [\tfrac12 n(1 - 2nC_1) + h(1 - nC_1)]\, m.$


The terms of the series (4.3.3) will be negative in the beginning but will finally
become positive. Let the (r + 1)st term be the first positive term. Since m is a
monotone decreasing function of h we have, writing $m_r$ for the value of m at h = r,


       $\frac{dP}{dw} < m_r(1 - 4w)^{-1/2}(C_2/C_1)^{n/2-1}(nC_1)^{-2}\sum_{h=0}^{\infty} \Gamma(\tfrac12 n + h)\{\Gamma(\tfrac12 n)\Gamma(h + 1)\}^{-1}(1 - C_2/C_1)^h$
       $\qquad\qquad \cdot [\tfrac12 n(1 - 2nC_1) + h(1 - nC_1)]$

       $\quad = m_r(1 - 4w)^{-1/2}(C_2/C_1)^{n/2-1}(2nC_1^2)^{-1}$
       $\qquad\qquad \cdot [(1 - 2nC_1)(C_2/C_1)^{-n/2} + (1 - C_2/C_1)(1 - nC_1)(C_2/C_1)^{-n/2-1}]$

       $\quad = 0.$


This proves the theorem except for the end point w = 0 of the interval $0 \le w \le \tfrac14$,
for which the series (4.2.1) and consequently (4.3.1) are not defined. To cover
this point we need only to note that the cumulative distribution function (cdf)
of the statistic $U = \chi^2_{1,1}((1/n)\chi^2_{2,n} + C_2(\chi^2_{3,n} - \chi^2_{2,n}))^{-1}$ is a continuous function
of $C_2$ and that when $C_2 \to 0$, the cdf of U tends to the cdf of Student's $t$ for n
degrees of freedom.

Having established the monotone nature of P with respect to w we are now
in a position to assert that U could be regarded as Student's $t$ with degrees of
freedom lying between n and 2n.







4.4. A p-variate generalization of U. The reader will at once recognize the
following technique for obtaining the generalization of U to the p-variate case
to be similar to that of obtaining Hotelling's T from the ordinary Student's $t$.
Consider


       $B_{12} = a_1b_{12} + a_2b_{13} + a_3b_{23} + \cdots + a_{p(p-1)/2}\, b_{p-1,p},$
       $B_{21} = a_1b_{21} + a_2b_{31} + a_3b_{32} + \cdots + a_{p(p-1)/2}\, b_{p,p-1}.$


Let A denote the sample covariance matrix of the $\tfrac12 p(p - 1)$ symmetric
differences. Define row vectors


       $a = (a_1, a_2, \cdots), \quad b_1 = (b_{12}, b_{13}, \cdots), \quad b_2 = (b_{21}, b_{31}, \cdots)$


and let $a'$, $b_1'$, $b_2'$ denote the corresponding column vectors. If we regard a's as
constants, then


       $\mathcal{U} = \frac{(B_{12} - B_{21})^2}{\text{Estimated var}(B_{12} - B_{21})} = \frac{\{a(b_1 - b_2)'\}^2}{aAa'}$


is also distributed like U. If we determine a's so as to maximize $\mathcal{U}$ we at once
find that $a \propto (b_1 - b_2)A^{-1}$ and we have


       $\mathcal{U} = (b_1 - b_2)A^{-1}(b_1 - b_2)',$

which reduces to U when p = 2. It can be shown after a very laborious simplification
that $\mathcal{U}$ is also invariant under all contragredient transformations. The
distribution of $\mathcal{U}$ is still under investigation.


4.5. The power function of U and its unbiased character. If we let


       $\delta^2 = (\beta_{12} - \beta_{21})^2(c_{22}\sigma_{11} + c_{11}\sigma_{22} - 2c_{12}\sigma_{12})^{-1},$


we shall presently see that except for the noncentral $\chi^2$ in the numerator the
non null distribution of U is similar to its null distribution. We shall first indicate
the results that can at most be accomplished in the non null case by the contragredient
transformation of y and x. Noting that under this type of transformation
$\beta$ and $\sigma$ are transformed cogrediently we state:

LEMMA 1. If $\beta_{12} \neq \beta_{21}$, there does not exist a nonsingular cogredient transformation
f which will reduce $\beta$ to a diagonal matrix and $\sigma$ to an identity matrix.

PROOF. Suppose there exists an f such that


       $f\sigma f' = I, \qquad f\beta f' = D,$

where I is an identity and D a diagonal matrix. Therefore $\beta = f^{-1}Df'^{-1}$; $\beta' = f^{-1}Df'^{-1}$,
yielding $\beta = \beta'$, which is contrary to the hypothesis.

LEMMA 2. If $\beta_{12} \neq \beta_{21}$, there exists a nonsingular transformation f which reduces
$\sigma$ to $\sigma^* = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$, $\beta$ to another nonsymmetric matrix $\beta^*$ and which leaves the standardised
"distance" $\delta$ between the two alternatives invariant.

PROOF. Such a transformation is given by $f = \begin{pmatrix} \sigma_{11}^{-1/2} & 0 \\ 0 & \sigma_{22}^{-1/2} \end{pmatrix}$. This completes
the proof.







We may thus derive the non null distribution of U assuming $\sigma_{11} = \sigma_{22} = 1$.
We shall presently see that the power function of U depends only on one nuisance
parameter $\rho$.

To reduce the positive definite form Z in the denominator of U to a linear
combination of two independently distributed $\chi^2$'s we proceed as follows:

(i) There exists an orthogonal transformation which will simultaneously
yield $s_{11} = \sum_1^n z_{1\alpha}^2/n$; $s_{22} = \sum_1^n z_{2\alpha}^2/n$; $s_{12} = \sum_1^n z_{1\alpha}z_{2\alpha}/n$, where $z_{1\alpha}$ and $z_{2\alpha}$ follow
a certain bivariate law.

(ii) The transformation

       $z_{1\alpha}' = (1 - \rho^2)^{-1/2}(z_{1\alpha} - \rho z_{2\alpha}),$
       $z_{2\alpha}' = z_{2\alpha}$

further reduces Z to a quadratic form in normally and independently distributed
variates.

(iii) A proper choice of $\theta$ in the orthogonal transformation

       $z_{1\alpha}'' = z_{1\alpha}'\cos\theta - z_{2\alpha}'\sin\theta,$
       $z_{2\alpha}'' = z_{1\alpha}'\sin\theta + z_{2\alpha}'\cos\theta$

ensures the vanishing of sample covariance of $z_{1\alpha}''$ and $z_{2\alpha}''$ and we obtain
$nZ = q_1\sum_1^n z_{1\alpha}''^2 + q_2\sum_1^n z_{2\alpha}''^2$, where $q_1$ and $q_2$ depend upon $\rho$ and the elements of the
matrix c.

Finally we have $U = \chi'^2_{1,1}(\gamma_1\chi^2_{2,n} + \gamma_2\chi^2_{3,n})^{-1}$, where $\chi'^2$ is a noncentral $\chi^2$ and

       $\gamma_1 = (2n)^{-1}[1 + (1 - 4|c|(1 - \rho^2)(c_{11} + c_{22} - 2\rho c_{12})^{-2})^{1/2}],$
       $\gamma_2 = (2n)^{-1}[1 - (1 - 4|c|(1 - \rho^2)(c_{11} + c_{22} - 2\rho c_{12})^{-2})^{1/2}].$

We observe that if the covariance matrix is an identity matrix, the values of
$\gamma_1$ and $\gamma_2$ check with the values of $C_1$ and $C_2$ (Section 4.3).


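Following Hsu [9] we obtain the following forms for the non null frequency
function and power function of U:


(4.5.1)   $g(U) = e^{-\delta^2/2}(\gamma_2/\gamma_1)^{n/2}\sum_{r=0}^{\infty} \frac{(\tfrac12\delta^2)^r\,\gamma_2^{r+1/2}\,U^{r-1/2}(1 + \gamma_2U)^{-n-r-1/2}}{\Gamma(r + 1)B(n, r + \tfrac12)}$
          $\qquad \cdot F\!\left(n + r + \tfrac12, \tfrac12 n, n, \frac{1 - \gamma_2/\gamma_1}{1 + \gamma_2U}\right)$


and


(4.5.2)   $\beta(\delta, \rho, n) = e^{-\delta^2/2}(\gamma_2/\gamma_1)^{n/2}\sum_{r=0}^{\infty}\sum_{h=0}^{\infty} (\tfrac12\delta^2)^r\,\Gamma(\tfrac12 n + h)(1 - \gamma_2/\gamma_1)^h$
          $\qquad \cdot \{\Gamma(\tfrac12 n)\Gamma(r + 1)\Gamma(h + 1)\}^{-1}\, I_{a_0}(n + h, r + \tfrac12),$

where F denotes the hypergeometric function and $a_0 = (1 + \gamma_2U_0)^{-1}$. Because
of the fixed relation $\gamma_1 + \gamma_2 = 1/n$ either of the above two results could be
expressed in terms of $\gamma_1$ or $\gamma_2$ and consequently $\rho$ is the only nuisance parameter
present in (4.5.1) and (4.5.2).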







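To show that U provides an unbiased test for the hypothesis $\beta_{12} = \beta_{21}$ we
state the following theorem:

THEOREM IV. For any n and fixed $\rho$ the power function $\beta(\delta, \rho, n)$ is a monotone
increasing function of the standardised "distance" $\delta$ between the two alternatives.

PROOF. Consider the double series


       $\sum_{r=0}^{\infty}\sum_{h=0}^{\infty} (\tfrac12\delta^2)^r\,\Gamma(\tfrac12 n + h)(1 - \gamma_2/\gamma_1)^h\{\Gamma(\tfrac12 n)\Gamma(r + 1)\Gamma(h + 1)\}^{-1}\, I_{a_0}(n + h, r + \tfrac12)$


which is dominated by


       $\sum_{r=0}^{\infty} (\gamma_1/\gamma_2)^{n/2}(\tfrac12\delta^2)^r / r!\,.$


This latter series has infinite radius of convergence and consequently we can
differentiate (4.5.2) term by term. Setting $\tfrac12\delta^2 = \lambda^*$ and differentiating we
obtain after simplification


       $\frac{\partial\beta(\delta, \rho, n)}{\partial\lambda^*} = (\gamma_2/\gamma_1)^{n/2}e^{-\lambda^*}\sum_{h=0}^{\infty}\sum_{r=0}^{\infty} \Gamma(\tfrac12 n + h)(1 - \gamma_2/\gamma_1)^h\lambda^{*r}$
       $\qquad \cdot \{\Gamma(\tfrac12 n)\Gamma(h + 1)\Gamma(r + 1)\}^{-1}[I_{a_0}(n + h, r + \tfrac32) - I_{a_0}(n + h, r + \tfrac12)].$


Since $I_{a_0}(n + h, r + \tfrac32) - I_{a_0}(n + h, r + \tfrac12) > 0$, therefore $\partial\beta(\delta, \rho, n)/\partial\lambda^* > 0$.
This proves the theorem and establishes the unbiased character of the test
based on U.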


REFERENCES 

[1] H. Hotelling, "Edgeworth's taxation paradox and the nature of demand and supply functions," Jour. Polit. Economy, Vol. 40 (1932), pp. 577-616.

[2] H. Schultz, The Theory and Measurement of Demand, University of Chicago Press, 1938, ch. 18.

[3] M. Bôcher, Introduction to Higher Algebra, The Macmillan Company, New York, 1907.

[4] H. Hotelling, "Relations between two sets of variates," Biometrika, Vol. 28 (1936), pp. 321-377.

[5] H. Hotelling, "Spaces of statistics and their metrization," Science, Vol. 67 (1928), pp. 149-150.

[6] H. Hotelling, "Commodity transformations and matrices," Annals of Math. Stat., Vol. 10 (1939), p. 88.

[7] J. Wishart, "The generalized product-moment distribution in samples from a normal multivariate population," Biometrika, Vol. 20 (1928), pp.

[8] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1946, pp. 245-247.

[9] P. L. Hsu, "Contribution to the theory of Student's t-test as applied to the problem of two samples," Stat. Res. Memoirs, Vol. 2 (1938), pp. 1-24.

[10] Uttam Chand, "Distributions related to the comparison of two means and two regression coefficients," Annals of Math. Stat., Vol. 21 (1950), pp. 507-522.

[11] H. Robbins and E. J. G. Pitman, "Application of the method of mixtures to quadratic forms in normal variates," Annals of Math. Stat., Vol. 20 (1949), pp. 552-560.

[12] Tjalling C. Koopmans, ed., Statistical Inference in Dynamic Economic Models, Cowles Commission Monograph 10, John Wiley and Sons, New York, 1950.





EXTREMAL PROPERTIES OF EXTREME VALUE DISTRIBUTIONS 


By Sigeiti Moriguti
University of North Carolina 


Summary. The upper and lower bounds for the expectation, the coefficient of 
variation, and the variance of the largest member of a sample from a symmetric 
population are discussed. The upper bound for the expectation (Table 1, Fig. 1), 
the lower bound for the C.V. (Table 2, Fig. 4) and the lower bound for the vari- 
ance (Fig. 7) are actually achieved for the corresponding particular population 
distributions (Figs. 2, 3, 5, 6, equation (5.1)). The rest of the bounds are not 
actually achieved but approached as the limits, for example, for the three-point 
distribution (Section 3) by letting p tend to zero. 


1. Introduction. The sampling distribution of the largest or the smallest 
member of a sample has been studied by several authors; Tippett [1] and de 
Finetti [2] considered a sample from a normal population, Olds [3] from a rec- 
tangular population. The case of a very large sample was treated by Dodd [4], 
Fisher and Tippett [5], and Gumbel [6], each for a certain class of population
distributions. 

Here we consider the upper and lower bounds for the expectation, the coeffi- 
cient of variation, and the variance of the extreme member of a sample from a 
symmetrically distributed population with a finite variance. To be specific, we 
will discuss only the largest member and take the mean of the population equal 
to zero. These conventions do not imply any essential restriction. 


2. Notations and formulas. Let the cumulative distribution function (cdf)
of the population be denoted by $F(x)$; then the cdf of the largest member $x_n$ of
a sample of size $n$ is given by $\{F(x)\}^n$. Hence the expectation of the largest
member can be expressed by

(2.1)  $E(x_n) = \int_{-\infty}^{\infty} x\, n\{F(x)\}^{n-1}\, dF(x).$


Now we consider the inverse function $x(F)$ of $F(x)$, with an obvious additional
definition at points of discontinuity, if any, of $F(x)$. Thus (2.1) can also be written as

(2.2)  $E(x_n) = \int_0^1 x(F)\, nF^{n-1}\, dF.$

Because of symmetry, $x(F) = -x(1 - F)$ holds almost everywhere, whence

(2.3)  $E(x_n) = \int_{1/2}^{1} x(F)\, n\{F^{n-1} - (1 - F)^{n-1}\}\, dF.$




Similarly, we get as the variance

(2.4)  $V(x_n) = \int_{1/2}^{1} \{x(F)\}^2\, n\{F^{n-1} + (1 - F)^{n-1}\}\, dF - \{E(x_n)\}^2.$

The population variance is of course given by

(2.5)  $\sigma^2 = 2\int_{1/2}^{1} \{x(F)\}^2\, dF.$
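
Formulas (2.3), (2.4) and (2.5) can be checked numerically for any symmetric population whose inverse cdf $x(F)$ is known. The following is a minimal sketch, not part of the original paper: the helper names `simpson`, `moments_of_largest` and `rectangular` are illustrative assumptions, and the rectangular population with zero mean and unit variance is used only as a convenient test case.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def moments_of_largest(x, n):
    """Return (E(x_n), V(x_n), sigma^2) from (2.3), (2.4), (2.5).

    x is the inverse cdf x(F) of a symmetric population with mean zero.
    """
    mean = simpson(lambda F: x(F) * n * (F**(n - 1) - (1 - F)**(n - 1)), 0.5, 1.0)
    second = simpson(lambda F: x(F)**2 * n * (F**(n - 1) + (1 - F)**(n - 1)), 0.5, 1.0)
    sigma2 = 2 * simpson(lambda F: x(F)**2, 0.5, 1.0)
    return mean, second - mean**2, sigma2

def rectangular(F):
    """Inverse cdf of the rectangular population on (-sqrt(3), sqrt(3))."""
    return math.sqrt(3) * (2 * F - 1)

mean, var, sigma2 = moments_of_largest(rectangular, 5)
print(mean, var, sigma2)
```

For the rectangular population the mean of the largest member is $\sqrt{3}(n-1)/(n+1)$, which gives a simple cross-check of the integration.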
3. Bounds for the expectation of the largest member. In Schwarz's inequality

(3.1)  $\left(\int_a^b f(F)\,g(F)\, dF\right)^2 \le \int_a^b \{f(F)\}^2\, dF \int_a^b \{g(F)\}^2\, dF,$

putting $a = \tfrac12$, $b = 1$, $f(F) = x(F)$, $g(F) = n\{F^{n-1} - (1 - F)^{n-1}\}$, we get a
formula which means, in view of (2.3) and (2.5), that

(3.2)  $E(x_n) \le \frac{\sigma}{\sqrt{2}}\, n \left(\int_{1/2}^{1} \{F^{n-1} - (1 - F)^{n-1}\}^2\, dF\right)^{1/2},$

equality being satisfied if and only if $f(F) = \text{const.}\cdot g(F)$, that is,

(3.3)  $x(F) = \text{const.}\cdot\{F^{n-1} - (1 - F)^{n-1}\}.$

Therefore the expectation of the largest member has the right-hand side of
(3.2) as an upper bound, which is actually achieved for a particular type of
population distribution given by (3.3).

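The integral in (3.2) can easily be evaluated as follows:

(3.4)  $\int_{1/2}^{1} \{F^{n-1} - (1 - F)^{n-1}\}^2\, dF = \frac12 \int_0^1 \{F^{2n-2} + (1 - F)^{2n-2} - 2F^{n-1}(1 - F)^{n-1}\}\, dF$
       $= \frac12\left[\frac{1}{2n-1} + \frac{1}{2n-1} - 2B(n, n)\right] = \frac{1}{2n-1} - B(n, n),$

where the Beta function of equal integral arguments can also be expressed as

(3.5)  $B(n, n) = \frac{1}{(2n-1)\binom{2n-2}{n-1}}.$

Thus the upper bound for $E(x_n)$ is given by

(3.6)  $E(x_n) \le \sigma\, \frac{n}{\sqrt{2(2n-1)}} \left(1 - \frac{1}{\binom{2n-2}{n-1}}\right)^{1/2}.$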


The numerical value of the coefficient is calculated for various sample sizes and
compared with the values of $E(x_n)/\sigma$ for normal and rectangular populations
in Table 1 and Fig. 1. It is to be noted that the value for a normal population
is remarkably close to the upper bound if $n \le 7$. The cumulative distribution
curve and frequency curve of the extremal distribution (3.3) are illustrated in
Figs. 2 and 3 for several values of sample size n.

It is obvious that the expectation of the largest member has the lower bound
zero. However it may be of some interest to see that this lower bound can be
approached as closely as one desires. One of the simplest ways is to consider the
three-point distribution, such as the values a, 0 and -a occurring with probabilities
p, 1 - 2p, and p, respectively. If we make p approach zero for a fixed
sample size n, the ratio $E(x_n)/\sigma$ also approaches zero, because in this case

(3.7)  $E(x_n) = nap + O(p^2),$
(3.8)  $\sigma^2 = 2a^2p.$


TABLE 1

Expectation of the largest member in the unit of $\sigma$, $E(x_n)/\sigma$

Sample size n    Upper bound    For normal distribution*    For rectangular distribution

      2            0.5774              0.5642                      0.5774
      3            0.8660              0.8463                      0.8660
      4            1.0420              1.0294                      1.0392
      5            1.1701              1.1630                      1.1547

      6            1.2767              1.2672                      1.2372
      7            1.3721              1.3522                      1.2990
      8            1.4604              1.4236                      1.3472
      9            1.5434              1.4850                      1.3856
     10            1.6222              1.5388                      1.4171

     11            1.6974              1.5864                      1.4434
     12            1.7693              1.6292                      1.4656
     13            1.8385              1.6680                      1.4846
     14            1.9052              1.7034                      1.5011
     15            1.9696              1.7359                      1.5155

     16            2.0320              1.7660                      1.5283
     17            2.0926              1.7939                      1.5396
     18            2.1514              1.8200                      1.5497
     19            2.2087              1.8450                      1.5588
     20            2.2645              1.8673                      1.5671

* From [9], p. 165.
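
The approach to the lower bound can be seen directly from the exact moments of the three-point distribution. The following sketch is illustrative (the function name and the choice a = 1 are assumptions); it uses the fact that the largest member equals a unless no observation equals a, and equals -a only when every observation equals -a.

```python
import math

def three_point_ratio(n, p, a=1.0):
    """Exact E(x_n)/sigma for values a, 0, -a with probabilities p, 1-2p, p."""
    p_top = 1 - (1 - p)**n      # at least one observation equals a
    p_bottom = p**n             # every observation equals -a
    mean = a * (p_top - p_bottom)
    sigma = a * math.sqrt(2 * p)   # from (3.8)
    return mean / sigma

for p in (1e-2, 1e-4, 1e-6):
    # Leading term implied by (3.7) and (3.8): n * sqrt(p / 2)
    print(p, three_point_ratio(5, p), 5 * math.sqrt(p / 2))
```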


4. Bounds for the coefficient of variation of the largest member of a sample.
Putting in (3.1) $a = \tfrac12$, $b = 1$, and

(4.1)  $f(F) = x(F)\sqrt{n\{F^{n-1} + (1 - F)^{n-1}\}}, \qquad g(F) = \frac{\sqrt{n}\,\{F^{n-1} - (1 - F)^{n-1}\}}{\sqrt{F^{n-1} + (1 - F)^{n-1}}},$

we get a formula which means, in view of (2.3) and (2.4), that

(4.2)  $\frac{V(x_n)}{\{E(x_n)\}^2} \ge \frac{1}{M_n} - 1,$


Fig. 1. Expectation of the largest member (plotted against sample size n; curves for the upper bound, the normal population and the rectangular population)
where

(4.3)  $M_n = \int_{1/2}^{1} \frac{n\{F^{n-1} - (1 - F)^{n-1}\}^2}{F^{n-1} + (1 - F)^{n-1}}\, dF.$

The equality in (4.2) is satisfied if and only if $f = \text{const.}\cdot g$, i.e.

(4.4)  $x(F) = \text{const.}\cdot \frac{F^{n-1} - (1 - F)^{n-1}}{F^{n-1} + (1 - F)^{n-1}}.$

Therefore the coefficient of variation of the largest member has $\sqrt{(1/M_n) - 1}$
as a lower bound which is actually achieved for a particular type of population
distribution given by (4.4).





Fig. 2. Extremal distribution (cdf) for $E(x_n)$

Fig. 3. Extremal distribution (pdf) for $E(x_n)$
The integral (4.3) can be evaluated by an elementary method of quadrature.
To show the results for small values of n,

$M_2 = \frac13 = 0.33333,$

$M_3 = 3 - \frac34\pi = 0.64381,$

$M_6 = -\frac67 + 0.6\pi\{(0.704 + 0.8\sqrt{0.8})\sqrt{5 - 2\sqrt5} + 2(0.704 - 0.8\sqrt{0.8})\sqrt{5 + 2\sqrt5}\} = 0.95300.$

TABLE 2
Coefficient of variation of the largest member

Sample size n    Lower bound    For normal population*    For rectangular population

      2            1.4142             1.4634                     1.4142
      3            0.7438             0.8838                     0.7746
      4            0.4737             0.6812                     0.5443
      5            0.3203             0.5752                     0.4226

      6            0.2221             0.5089                     0.3464


As the sample size n increases, the evaluation of $M_n$ by quadrature becomes
more and more laborious. Numerical integration would be preferable for larger
values of n. In this case, however, we can derive (see Appendix 1 for the derivation)
an asymptotic formula of $M_n$ for large n

(4.5)  $M_n = 1 - \frac{\pi}{2^n}\left[1 + O\!\left(\frac1n\right)\right],$

which happens to be a fairly close approximation even for as small a value of
n as six, where this formula gives 0.95091. Using these results, we compare the
lower bound with the value of the C.V. of the largest member for a normal
population and a rectangular population, as in Table 2 and Fig. 4.
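
The numerical integration mentioned above can be sketched as follows. This is an illustrative check rather than the author's computation: `simpson`, `M` and `cv_lower_bound` are assumed names, and Simpson's rule is simply one convenient quadrature. It evaluates (4.3), the lower bound $\sqrt{(1/M_n) - 1}$, and the asymptotic value from (4.5).

```python
import math

def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def M(n):
    """M_n of (4.3) by numerical integration over 1/2 <= F <= 1."""
    def integrand(F):
        diff = F**(n - 1) - (1 - F)**(n - 1)
        summ = F**(n - 1) + (1 - F)**(n - 1)
        return n * diff**2 / summ
    return simpson(integrand, 0.5, 1.0)

def cv_lower_bound(n):
    """Lower bound sqrt(1/M_n - 1) for the C.V. of the largest member."""
    return math.sqrt(1 / M(n) - 1)

for n in range(2, 11):
    print(n, round(M(n), 5), round(1 - math.pi / 2**n, 5), round(cv_lower_bound(n), 4))
```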

It is interesting to observe that the C.V. of the largest member of a sample
from a two-point population, such as values 1 and -1 each occurring with
probability 1/2, behaves asymptotically similarly to the lower bound except
for a numerical factor $\sqrt{\pi}/2$. In fact, we can easily derive in this case the
following formulas

(4.6)  $E(x_n) = 1 - \frac{1}{2^{n-1}}, \qquad V(x_n) = \frac{2^n - 1}{2^{2n-2}},$

(4.7)  $\frac{\sqrt{V(x_n)}}{E(x_n)} = \frac{\sqrt{2^n - 1}}{2^{n-1} - 1} \approx \frac{1}{2^{n/2 - 1}}.$

Fig. 4. Coefficient of variation of the largest member (plotted against sample size n)

This similarity in the asymptotic behavior may be taken to be the reflection
of the similarity in the population distribution, which is seen in comparing Figs.
5 and 6 with the corresponding graphs for the two-point distribution.
There is no finite upper bound for the coefficient of variation of the largest
member. It can be proved, for instance, by observing the behavior in the case
of the three-point distribution mentioned in the previous section when p
approaches zero for fixed n. In fact, in this case, it is easy to show that

(4.8)  $V(x_n) = na^2p + O(p^2),$

       $\frac{\sqrt{V(x_n)}}{E(x_n)} = \frac{1}{\sqrt{np}}\,\{1 + O(p)\}.$

Fig. 5. Extremal distribution (cdf) for C.V.$(x_n)$; the curve is $x = \{F^{n-1} - (1 - F)^{n-1}\}/\{F^{n-1} + (1 - F)^{n-1}\}$


5. Bounds for the variance of the largest member.¹ As we shall prove, $V(x_n)$
has a lower bound $\lambda_n\sigma^2$, which is actually achieved for a particular type of
population distribution given, when F is not 0 or 1, by

(5.1)  $x = \text{const.}\cdot \frac{n\{F^{n-1} - (1 - F)^{n-1}\}}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda_n},$

where $\lambda_n$ is the only root of the equation²

(5.2)  $M_n(\lambda) \equiv \int_{1/2}^{1} \frac{n^2\{F^{n-1} - (1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF = 1$

in the interval $0 < \lambda \le n/2^{n-1}$.


¹ A heuristic derivation of the formulas (5.1) and (5.2) is given in Appendix 2.
² The notation is such that $M_n(0)$ equals $M_n$ as previously defined.





First, in order to prove that there exists one and only one root of (5.2) in
the stated interval, it is sufficient to show that

(5.3)  $M_n(0) < 1, \qquad M_n(n/2^{n-1}) > 1,$

and to note that $M_n(\lambda)$ is a monotone increasing continuous function of $\lambda$ in
the interval. Since

(5.4)  $\int_{1/2}^{1} n\{F^{n-1} + (1 - F)^{n-1}\}\, dF = 1,$

we have, for any $\lambda$ in the interval,

(5.5)  $1 - M_n(\lambda) = \int_{1/2}^{1} \frac{n^2\{F^{n-1} + (1 - F)^{n-1}\}^2 - 2\lambda n\{F^{n-1} + (1 - F)^{n-1}\} - n^2\{F^{n-1} - (1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF$
       $= \int_{1/2}^{1} \frac{4n^2F^{n-1}(1 - F)^{n-1} - 2\lambda n\{F^{n-1} + (1 - F)^{n-1}\}}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF.$

Fig. 6. Extremal distribution (pdf) for C.V.$(x_n)$

On the other hand, it is obvious that $F^{n-1} + (1 - F)^{n-1}$ assumes a minimum
value $1/2^{n-2}$, and $F^{n-1}(1 - F)^{n-1}$ a maximum value $1/2^{2n-2}$ at $F = 1/2$. Therefore,
in the interval³ $1/2 < F < 1$, the denominator of the integrand is always
positive, the numerator being always positive for $\lambda = 0$ and always negative
for $\lambda = n/2^{n-1}$. Hence we get (5.3). The above mentioned nature of $M_n(\lambda)$ is
also obvious (cf. the definition (5.2) and the above statement about the denominator).


Next, again in Schwarz's inequality, let us put $a = 1/2$, $b = 1$ and

(5.6)  $f(F) = x(F)\,[n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda_n]^{1/2},$

(5.7)  $g(F) = \frac{n\{F^{n-1} - (1 - F)^{n-1}\}}{[n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda_n]^{1/2}}.$

Then we obtain a formula which means, in view of (2.3), (2.4), (2.5) and
$M_n(\lambda_n) = 1$, that

(5.8)  $V(x_n) \ge \lambda_n\sigma^2,$

equality being satisfied if and only if $f = \text{const.}\cdot g$, i.e. (5.1) holds. Thus the
statement at the beginning of this section has been proved.

The numerical evaluation of $\lambda_n$ requires a little more effort than the evaluation
of $M_n$ in the previous section, as the former requires solution of a transcendental
equation after an integration. For instance, for n = 3,⁴ $\lambda_3$ can be
obtained by solving

$\tan^{-1}\frac{1}{\sqrt{1 - \frac43\lambda}} = \frac{2}{3\sqrt{1 - \frac43\lambda}}$

as $\lambda_3 = .394$. For n = 4, we have to solve

$\tan^{-1}\sqrt{\frac{3}{1 - 2\lambda}} = \sqrt{\frac{3}{1 - 2\lambda}}\left\{1 - \frac{88 - 5\lambda}{10(4 + \lambda)^2}\right\}$

to get $\lambda_4 = .209$. Moreover, when $n \ge 7$, the quadrature itself is tedious. For
large n, however, an asymptotic formula is again available as shown in Appendix
3. It is closely related to (4.5), and takes the form

(5.9)  $\lambda_n = \frac{\pi}{2^n}\left[1 + O\!\left(\frac1n\right)\right].$

Again it is fairly close even if n is small.

The general picture is seen in Fig. 7, in which the lower bound of $\sqrt{V(x_n)}/\sigma$
is shown together with the value for normal [7], rectangular, and two-point
distributions.
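
When the closed forms become tedious, $\lambda_n$ can also be found by solving (5.2) numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it integrates $M_n(\lambda)$ with Simpson's rule and locates the root by bisection on the interval $0 < \lambda < n/2^{n-1}$, relying on the monotonicity of $M_n(\lambda)$ established above. The names `simpson`, `M` and `solve_lambda` are hypothetical.

```python
import math

def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def M(n, lam):
    """M_n(lambda) of (5.2); valid for lambda strictly below n / 2**(n-1)."""
    def integrand(F):
        diff = F**(n - 1) - (1 - F)**(n - 1)
        summ = F**(n - 1) + (1 - F)**(n - 1)
        return (n * diff)**2 / (n * summ - 2 * lam)
    return simpson(integrand, 0.5, 1.0)

def solve_lambda(n, tol=1e-10):
    """Bisection for the root of M_n(lambda) = 1 (M_n is increasing in lambda)."""
    lo, hi = 0.0, n / 2**(n - 1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if M(n, mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for n in (2, 3, 4, 8):
    print(n, round(solve_lambda(n), 4), round(math.pi / 2**n, 4))
```

The last column is the leading term of (5.9), so the printout also shows how quickly the asymptotic formula becomes usable.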

As for the upper bound, it is easy to see, from (2.4) and (2.5), that

(5.10)  $V(x_n) < n\int_{1/2}^{1} \{x(F)\}^2\, dF = \tfrac12 n\sigma^2,$

for $F^{n-1} + (1 - F)^{n-1}$ is a monotone increasing function taking the value unity
at the end F = 1 of the interval. The value n/2 of the ratio $V(x_n)/\sigma^2$ can be
approached as closely as desired, for example, for the three-point distribution
(Section 3) by letting p be sufficiently small. (See (3.8) and (4.8).)

Fig. 7. Standard deviation of the largest member (plotted against sample size n)

³ The suspicion about the singularity which might occur in the case of $\lambda = n/2^{n-1}$ at $F = \tfrac12$
is dissolved if we note that the numerator also has a zero of the second order at $F = \tfrac12$.

⁴ For n = 2, (5.1) reduces to a rectangular distribution, for which no more calculation
is necessary.


6. Final remarks and acknowledgement. We considered the upper and lower
bounds for the expectation, the coefficient of variation, and the variance of the
largest member of a sample from a symmetric population. The upper bound for
the expectation and the lower bound for the C.V. or the variance are actually
achieved for particular distributions, which we may call "extremal distributions".
These distributions as well as the values of the corresponding bounds
were first obtained, as illustrated in Appendix 2, by applying the techniques of
the Calculus of Variations. The same methods can be applied also to the distribution
of the range⁵ of the sample and some other useful statistics.

The writer is indebted to Professor Harold Hotelling for his suggestions which
induced him to undertake this study and for his kind guidance and encouragement
in the course of study.


APPENDICES 


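1. Asymptotic formula for $M_n$. Putting $\lambda = 0$ in (5.5), we get

$1 - M_n = \int_{1/2}^{1} \frac{4nF^{n-1}(1 - F)^{n-1}}{F^{n-1} + (1 - F)^{n-1}}\, dF.$

With the change of variable $t = 2F - 1$, this integral becomes

$1 - M_n = \frac{1}{2^{n-2}} \int_0^1 \frac{n(1 - t^2)^{n-1}}{(1 + t)^{n-1} + (1 - t)^{n-1}}\, dt.$

When n increases indefinitely,

$\frac{(1 - t^2)^{n-1}}{(1 + t)^{n-1} + (1 - t)^{n-1}} = \frac{1}{e^{nt} + e^{-nt}}\left[1 + O\!\left(\frac1n\right)\right];$

therefore,

$1 - M_n = \frac{1}{2^{n-2}} \int_0^1 \frac{n\, dt}{e^{nt} + e^{-nt}}\left[1 + O\!\left(\frac1n\right)\right] = \frac{1}{2^{n-2}}\left(\tan^{-1}e^n - \frac{\pi}{4}\right)\left[1 + O\!\left(\frac1n\right)\right].$

But

$\tan^{-1}e^n = \frac{\pi}{2} + O(e^{-n}).$

Therefore,

$1 - M_n = \frac{\pi}{2^n}\left[1 + O\!\left(\frac1n\right)\right].$

This is (4.5).

⁵ Thanks are due to Professor Olds at Carnegie Institute of Technology for calling the
author's attention to R. L. Plackett's paper [8] which derived essentially the same result as
given in Section 3 of the present paper by a somewhat different approach.


2. Derivation of (5.1) and (5.2). In order to minimize (2.4) under the condition
that (2.5) is kept constant, we put the first variation of

$\int_{1/2}^{1} \{x(F)\}^2\, n\{F^{n-1} + (1 - F)^{n-1}\}\, dF - \{E(x_n)\}^2 - 2\lambda\int_{1/2}^{1} \{x(F)\}^2\, dF$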


equal to zero, of course taking account of (2.3). Thus we obtain as the characteristic
equation

$x(F)\, n\{F^{n-1} + (1 - F)^{n-1}\} - E(x_n)\, n\{F^{n-1} - (1 - F)^{n-1}\} - 2\lambda x(F) = 0,$

which can easily be solved as

$x(F) = \frac{E(x_n)\, n\{F^{n-1} - (1 - F)^{n-1}\}}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}.$

But this solution is eligible only if it satisfies (2.3), that is only if

$E(x_n) = E(x_n) \int_{1/2}^{1} \frac{n^2\{F^{n-1} - (1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF.$

As $E(x_n)$ cannot be zero except in the trivial case $x(F) = 0$, $\lambda$ must be a solution
of (5.2). If there exists a solution $\lambda_n$, as is actually the case, then

$x = \text{const.}\cdot \frac{n\{F^{n-1} - (1 - F)^{n-1}\}}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda_n}$

is eligible as a solution of the characteristic equation.


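3. Asymptotic formula for $\lambda_n$. $M_n(\lambda)$ can be transformed as follows.

$M_n(\lambda) = \int_{1/2}^{1} \frac{[n\{F^{n-1} + (1 - F)^{n-1}\} - 2n(1 - F)^{n-1}]^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF$

$= \int_{1/2}^{1} \left[ n\{F^{n-1} + (1 - F)^{n-1}\} + 2\lambda - 4n(1 - F)^{n-1} + \frac{\{2\lambda - 2n(1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda} \right] dF$

$= 1 + \lambda - \frac{4}{2^n} + \int_{1/2}^{1} \frac{\{2\lambda - 2n(1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda}\, dF.$

Therefore, $\lambda_n$ must satisfy the equation

$\lambda_n + \int_{1/2}^{1} \frac{\{2\lambda_n - 2n(1 - F)^{n-1}\}^2}{n\{F^{n-1} + (1 - F)^{n-1}\} - 2\lambda_n}\, dF = \frac{4}{2^n}.$

As the integral is positive, we get $\lambda_n < 4/2^n$. This inequality certifies that the
last term of the denominator in the last integral, or in (5.5), can be neglected
as of order 1/n times that of the first term. Therefore

$\lambda_n = \int_{1/2}^{1} \frac{4nF^{n-1}(1 - F)^{n-1}}{F^{n-1} + (1 - F)^{n-1}}\, dF \left[1 + O\!\left(\frac1n\right)\right]$

$= (1 - M_n)\left[1 + O\!\left(\frac1n\right)\right]$

$= \frac{\pi}{2^n}\left[1 + O\!\left(\frac1n\right)\right].$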


REFERENCES

[1] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a
normal population," Biometrika, Vol. 17 (1925), pp. 364-387.
[2] B. de Finetti, "Sulla legge di probabilità degli estremi," Metron, Vol. 9 (1932), pp.
127-138.
[3] E. G. Olds, "Distribution of greatest variates, least variates and intervals of variation
in samples from a rectangular universe," Bull. Am. Math. Soc., Vol. 41 (1935),
pp. 297-304.
[4] E. L. Dodd, "The greatest and the least variate under general laws of error," Trans.
Am. Math. Soc., Vol. 25 (1923), pp. 525-539.
[5] R. A. Fisher and L. H. C. Tippett, "Limiting forms of the frequency distribution of
the largest or smallest member of a sample," Proc. Cambridge Philos. Soc., Vol.
24 (1928), pp. 180-190.
[6] E. J. Gumbel, "Les valeurs extrêmes des distributions statistiques," Annales Institut
Henri Poincaré, Vol. 4 (1935), pp. 115-158.
[7] H. J. Godwin, "Some low moments of order statistics," Annals of Math. Stat., Vol. 20
(1949), pp. 279-285.
[8] R. L. Plackett, "Limits of the ratio of mean range to standard deviation," Biometrika,
Vol. 34 (1947), pp. 120-122.
[9] K. Pearson, Tables for Statisticians and Biometricians, Part II, 1st ed., Cambridge
University Press, 1931.





THE FITTING OF POLYNOMIALS BY THE METHOD OF 
WEIGHTED GROUPING 


By P. G. Guest
University of Sydney, Australia 


Summary. A method of fitting polynomials to equally spaced data is developed
which is more rapid than the method of least squares. The orthogonal polynomial
$T_j(x)$ of the least squares method is replaced by a step function $w_j(x)$,
and this greatly reduces the number of multiplications. An efficiency of about
90 per cent is obtained for the estimates of the coefficients and fitted values.


1. Introduction. An appreciable shortening in the time required to fit a curve
to a series of n equally spaced observations $y(x)$ is effected by the use of tables of
the orthogonal polynomials $T_j(x)$ or $\xi'_j(x)$ [1], [2], [3]. However, the process is still
tedious if the number of observations is at all large. A considerable time is spent
in the calculation of the orthogonal moments $\Sigma T_j(x)y(x)$, and a mistake in these
calculations can easily be made.

In the present paper a method of curve fitting is developed which considerably
reduces the time required for the calculation of the moments. The continuous
function $T_j(x)$ is replaced by a step function $w_j(x)$. The observations $y(x)$ are
summed over each interval of constancy of $w_j(x)$. The groups so formed are
multiplied by the weighting factor $w_j(x)$ and added to give $\Sigma w_j(x)y(x)$.

The number of groups required is found to be $\tfrac12(j + 1)$ or $\tfrac12(j + 2)$ according
as j is odd or even. Thus for the coefficients of the fourth and fifth degrees the
number of multiplications is reduced to three; the number of weights which have
to be tabulated is also reduced to three.

The estimates of the polynomial coefficients and fitted values obtained by the
method of grouping are all unbiased, and have an efficiency of about 90 per cent.
This means firstly that the standard error of the value obtained by the method of
grouping is about 5 per cent greater than the standard error obtained by the
method of least squares, and secondly that the probability that the difference
between the values obtained by these two methods exceeds their standard error
is very small. In practically all cases this efficiency will be quite adequate [4].

A pleasing feature of the method of grouping is that the calculation of the
coefficients can be carried out easily without a calculating machine, at least for
polynomials of lower degree than the fourth. The calculations can of course be
done much more rapidly if a machine is used.


2. Estimation of the power series coefficients. To form an unbiased estimate
of the coefficient $b_{pj}$ in the polynomial

(1)  $u_p(x) = \sum_{j=0}^{p} b_{pj}\, x^j$

which is to fit the observed values $y(x)$, we must (directly or indirectly) multiply
each observation by a weight $w_{pj}(x)$ so chosen that

(2)  $\sum_x w_{pj}(x)\, x^k = 0, \qquad k \le p,\ k \ne j,$

the sum being taken over the n observations. Then the estimate of $b_{pj}$ is

(3)  $b_{pj} = \sum_x w_{pj}(x)\, y(x) \Big/ \sum_x w_{pj}(x)\, x^j.$

The fact that $w_{pj}(x)$ depends not only on j but also on p is a disadvantage. A
useful system of weights is obtained by selecting weights $w_{pj}(x)$ which are linear
functions of weights $w_j(x)$ independent of p, $w_j(x)$ being chosen so that

(4)  $\sum_x w_j(x)\, x^k = 0, \qquad k < j.$

We can then write $w_{pj}(x)$ in the form

(5)  $\frac{w_{pj}(x)}{\Sigma w_{pj}x^j} = \frac{w_j(x)}{\Sigma w_j x^j} + \beta_{j+1,j}\frac{w_{j+1}(x)}{\Sigma w_{j+1}x^{j+1}} + \cdots + \beta_{pj}\frac{w_p(x)}{\Sigma w_p x^p},$

where $w_j$ represents $w_j(x)$ and where the coefficients $\beta_{kj}$ are determined from the
condition

(6)  $\Sigma\, w_{pj}(x)\, x^k = 0, \qquad j < k \le p,$

that is,

(7)  $\frac{\Sigma w_j x^k}{\Sigma w_j x^j} + \beta_{j+1,j}\frac{\Sigma w_{j+1}x^k}{\Sigma w_{j+1}x^{j+1}} + \cdots + \beta_{kj}\frac{\Sigma w_k x^k}{\Sigma w_k x^k} = 0.$
Dw; 2 Biv. Dwjyi2't! ” Dur z* 

The advantage of using such a system of weights is that we can introduce
statistics

(8)  $a_j = \sum_x w_j(x)\, y(x) \Big/ \sum_x w_j(x)\, x^j$

which are independent of p and express $b_{pj}$ as a linear function of these statistics.
In fact it follows from equations (3) and (5) that

(9)  $b_{pj} = a_j + \beta_{j+1,j}\,a_{j+1} + \cdots + \beta_{pj}\,a_p.$

The method of least squares is a particular case of this method of weighting
[5]. In the method of least squares $w_j(x) = T_j(x)$, the orthogonal polynomial of
degree j.

The calculation of the coefficients $\beta_{kj}$ can be done most conveniently by
evaluating the quantities

(10)  $\alpha_{kj} = -\sum_x w_j(x)\, x^k \Big/ \sum_x w_j(x)\, x^j.$

Then equation (7) becomes

(11)  $\beta_{kj} = \alpha_{kj} + \alpha_{k,j+1}\beta_{j+1,j} + \cdots + \alpha_{k,k-1}\beta_{k-1,j},$

and the coefficients $\beta$ can be built up in turn from the coefficients $\alpha$.
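
The recursion (8) to (11) can be traced in a few lines of code. The sketch below is illustrative only: it takes the least squares case $w_j = T_j$, builds the orthogonal weights by a Gram-Schmidt step (an assumption made here for convenience, any weights satisfying (4) would do), and uses exact rational arithmetic. All function names are hypothetical.

```python
from fractions import Fraction

def orthogonal_weights(xs, p):
    """Weights w_0..w_p satisfying (4): Gram-Schmidt on 1, x, ..., x^p over xs."""
    ws = []
    for j in range(p + 1):
        w = [Fraction(x) ** j for x in xs]
        for v in ws:
            c = sum(a * b for a, b in zip(w, v)) / sum(b * b for b in v)
            w = [a - c * b for a, b in zip(w, v)]
        ws.append(w)
    return ws

def moment(w, xs, k):
    """Sum over the observations of w(x) * x^k."""
    return sum(wi * Fraction(x) ** k for wi, x in zip(w, xs))

def fit(xs, ys, ws, p):
    """Coefficients b_p0..b_pp of the fitted polynomial via (8)-(11)."""
    # (8): statistics a_j, independent of p
    a = [sum(wi * Fraction(y) for wi, y in zip(ws[j], ys)) / moment(ws[j], xs, j)
         for j in range(p + 1)]
    # (10): alpha_kj for k > j
    alpha = {(k, j): -moment(ws[j], xs, k) / moment(ws[j], xs, j)
             for j in range(p + 1) for k in range(j + 1, p + 1)}
    # (11): beta_kj built up in turn from the alphas
    beta = {}
    for j in range(p + 1):
        for k in range(j + 1, p + 1):
            beta[k, j] = alpha[k, j] + sum(alpha[k, i] * beta[i, j]
                                           for i in range(j + 1, k))
    # (9): b_pj as a linear function of the a's
    return [a[j] + sum(beta[k, j] * a[k] for k in range(j + 1, p + 1))
            for j in range(p + 1)]

xs = list(range(-3, 4))
ys = [2 + 3 * x - x * x for x in xs]
print(fit(xs, ys, orthogonal_weights(xs, 2), 2))
```

Because the estimates are unbiased, data lying exactly on a polynomial of degree p are reproduced exactly, which is what the example prints.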





3. The method of weighted grouping. In the method of weighted grouping
we replace the continuous function $T_j(x)$ occurring in the least squares solution
by a step function $w_j(x)$. In effect we assign the same weight $w_j(x)$ to all
observations in a region where $T_j(x)$ is fairly constant.

The criterion used in the choice of groups is that of maximum efficiency for the
coefficient $a_j$ defined by (8), with $w_j(x)$ satisfying (4). The minimum number of
groups required is thus j + 1. It would be possible of course to choose a larger
number of groups, but this complicates the method without producing any great
increase in efficiency. Adopting the value j + 1 for the number of groups, the
values of the weights for each method of grouping are uniquely determined
(except for an arbitrary multiplying factor) by equation (4).

When the observations are equally spaced, it is most convenient to change to
a variable $\epsilon$ whose origin is at the centre of the points of observation x, and whose
scale is such that the interval between successive observations is unity. An obvious
simplification of the method of grouping for equally spaced observations
is to make the groups symmetrical about the origin; that is, to take $w_j(-\epsilon) =
(-)^j w_j(\epsilon)$. This reduces the number of different weights to $\tfrac12(j + 1)$ or $\tfrac12(j + 2)$
according as j is odd or even. Also $\beta_{kj} = 0$ when j + k is odd. The observations
are to be grouped by adding corresponding observations $y(\epsilon)$ of equal $|\epsilon|$ if j is
even and subtracting corresponding observations if j is odd.

It does not seem feasible to calculate general formulae for the method of
grouping to give maximum efficiency. However, it is relatively simple to calculate
the efficiency for a particular value of n and any chosen arrangement of groups,
and hence to determine the best method of grouping for each n.

The important question is whether the efficiencies will be high enough to make
the method a satisfactory substitute for the method of least squares. The maximum
efficiency for large n (greater than about 50) is found to be practically
constant for each coefficient. The efficiencies are listed below, and are seen to
be all in the region of 90 per cent. For smaller n the efficiencies tend to be somewhat
higher than these values.


$a_0$   100%      $b_{20}$  93.9%     $b_{40}$  91.2%
$a_1$   88.9%     $b_{31}$  92.5%     $b_{51}$  91.2%
$a_2$   89.7%     $b_{42}$  89.5%
$a_3$   90.1%     $b_{53}$  91.2%
$a_4$   90.4%
$a_5$   90.6%


The efficiency of the estimate $u_p(x)$ of the fitted value at a point varies somewhat
with the location of the point, but is always close to 90%.

The coefficients b are linear functions of the coefficients a, but the method of
grouping which gives the greatest efficiency for the coefficients a does not in
general correspond to that method which would give maximum efficiency for
a particular $b_{pj}$, since the coefficients $a_j$, unlike the corresponding least squares
coefficients, are not orthogonal. But the choice of more complicated weights
leads to only a slight improvement in the efficiency, and the method using
weights $w_j$ independent of p is much more convenient.
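
For the linear coefficient $a_1$ the search for the best grouping is short enough to write out. The following sketch rests on assumptions that are not spelled out in the text: observations are independent with equal variance, the two groups are the outermost blocks on each side with weights +1 and -1, and efficiency is taken as the ratio of the least squares variance $\sigma^2/\Sigma\epsilon^2$ to the grouped variance $\sigma^2\Sigma w_1^2/(\Sigma w_1\epsilon)^2$. The function names are hypothetical.

```python
from fractions import Fraction

def centred_abscissae(n):
    """Positive values of epsilon for n equally spaced points centred on zero."""
    if n % 2:   # odd n: 1, 2, ..., (n-1)/2; epsilon = 0 gets zero weight for j = 1
        return [Fraction(k) for k in range(1, (n - 1) // 2 + 1)]
    return [Fraction(2 * k - 1, 2) for k in range(1, n // 2 + 1)]

def best_linear_grouping(n):
    """Return (smallest |epsilon| in the group, divisor sum(w_1 * eps), efficiency)."""
    eps = centred_abscissae(n)
    total_sq = 2 * sum(e * e for e in eps)       # sum of eps^2 over all observations
    best = None
    for start in range(len(eps)):
        group = eps[start:]                      # outer block on the positive side
        divisor = 2 * sum(group)                 # both sides contribute equally
        efficiency = divisor ** 2 / (2 * len(group) * total_sq)
        if best is None or efficiency > best[2]:
            best = (group[0], divisor, efficiency)
    return best

for n in (7, 8, 10, 55):
    first, divisor, eff = best_linear_grouping(n)
    print(n, first, divisor, float(eff))
```

Under these assumptions the divisors found for n = 7, 8, 10 and 55 are 10, 15, 21 and 666, and the efficiency for n = 55 is about 88.9 per cent.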





4. Tables and illustrative example. The following quantities are tabulated
for values of n from 7 to 55 and for polynomials up to the fifth degree:

(a) The best method of grouping for the estimation of $a_j$, together with the
weights $w_j(\epsilon)$.

(b) The divisor $\Sigma w_j(\epsilon)\epsilon^j$.

(c) The coefficients $\beta_{kj}$.

The coefficients $\beta_{kj}$ are not in general integers. $\beta_{20}$ and $\beta_{31}$ are tabulated in full
(r signifying that the last figure is repeated indefinitely), while $\beta_{42}$ and $\beta_{53}$ are
given to ten significant figures and $\beta_{40}$ and $\beta_{51}$ to nine significant figures.

In the tables the observations are numbered by the value of $|\epsilon|$ if n is odd and
by the value of $|\epsilon| + \tfrac12$ if n is even. For example, for 62 observations the numbers
are 1 to 31, for 63 observations 0 to 31. For even values of j observations of
equal $|\epsilon|$ are added, while for odd values they are subtracted. This is indicated
by the suffix + or - under the summation sign. The expression c(a-b) means
that the observations numbered a to b (inclusive) are to be grouped and multiplied
by the weight c.

It is convenient to illustrate the use of the tables by a specific example. We
shall use the example of Birge and Shea [6], the measurements being the frequencies
of the first 25 lines of the P branch of a CuH band. The frequencies
vary from 22,330.52 to 23,295.47. After subtracting the constant amount 22,300
from each observation, the values are written down as in Model Form 1, starting
from the bottom of the left-hand column and working up this column and down the
right-hand column.

The groups, weights, and $\beta$-coefficients are then entered in Model Form 2.
Lines are drawn in Model Form 1 to indicate the sums required. For example,
$\Sigma_+(6-10)$ occurs in $a_4$, so a line is drawn to the right between 5 and 6 and
another line between 10 and 11.

The sums of the corresponding terms in the columns are added starting from
the top, the progressive totals being entered at the right wherever a line is
drawn. The differences between the corresponding terms are next added, the
progressive totals being entered at the left. The required sums are obtained as
differences of the progressive totals; for example,

$\Sigma_+(6-10) = 12873.55 - 7017.03.$

As a check, the right-hand and left-hand columns are added. If the sums are
R and L, the final $\Sigma_-$ total should be R - L, the final $\Sigma_+$ total R + L (+ y(0)).

The calculations indicated in Model Form 2 are then carried through. It is
not necessary to record the actual sums. In working out $a_2$ on a calculating
machine, the steps are

$11(14968.38) - 11(11765.71) - 6(7017.03) = 7370\, a_2.$







TABLE OF WEIGHTS 


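$a_1 = \Sigma_- w_1 y / \Sigma w_1\epsilon$        $b_{51} = a_1 + \beta_{31}a_3 + \beta_{51}a_5$

n    $\beta_{20}$    $\beta_{40}$    $w_1$*    $\Sigma w_1\epsilon$    $\beta_{31}$    $\beta_{51}$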


10.56 1(2—3) 10 —7 
—5.25 21 .8352273 1(2—4) 15 —8. 46.5625 
—6.6r 32.3478261 | 1(2—4) 18 —ll1 86 .6666667 
—8.25 56 .0782895 1(8—5) 21 —14. 152.0625 


—10 56 .4324324 1(2—5) 28 —16 171 .578947 
—11.916r 924430970 1(3—6) 32 —19, 77 . 198864 
—14 118.484210 1(3—6) 36 —24 400 .307692 
—16.25 178.9125 1(3—7) 45 —26. 555 .757415 
—18.6r 259.2 1(3—7) 50 —31 761. 176471 


—21.25 363 .085227 1(4—8) 55 — 36.28 1054 .98355 
—24 440 .228571 1(3—8) 66 —39 1200.25 

—26.916r 591 .596399 1(4—9) 72 —44. 1550 .67361 
—30 596 .689655 1(4—9) 78 —5l 2053 .08475 
—33.25 700. 141810 1(4—10) 91 —5A. 2471 .49762 


—36.6r 915.2 1(4—10) 98 —61 2677 .97849 
—40.25 1174.97540 1(5—11) 105 —68. 3303 .74133 
—44 1485 1(4—11) 120 —72 4017 .72973 
—47.916r 1699 .51705 1(5—12) 128 —79. 4867 .71991 
—52 2103 .90448 | 1(6—12) 136 —88 5462 


— 56.25 2575. 1575 1(5—13) 153 —92. 6518 .06250 
—60.6r 2593 .69038 |} 1(5—13) 162 —101 7717 .36937 
—65.25 2902 .9725 1(6—14) 171 —110. 9321 .53708 
—70 3505 . 46342 1(5—14) 190 —115 10591 . 5486 
—74.916r 4194 .79821 1(6—15) 200 —124. 12598 .6870 


—80 4653 .87610 1(6--15) 210 —135 13542 . 5807 
—85.25 5500 .39123 1(6—16) 231 —140. 15553 .9937 
—90.6r 6455 .21590 1(6—16) 242 —151 17882 .4828 
—96 .25 7105 .31250 | 10—17) 253 — 162. 20818 .9764 
—102 7099 .2 1(6—17) 76 — 168 23577 


—107.916r 8260 .38603 1(7—18) 288 —179. 24189 .5306 
—114 9554 .95384 1(7—18) 300 —192 27885 .1777 
—120.25 10396 .6582 1(7—19) 325 —198. 28898 .9048 
—126.6r 11927 .1611 1(7—19) 338 —211 33115 .0058 
— 133.25 13617 .3606 | 1(8—20) 351 — 224.25 37136 .3294 


—140 14739 .4967 1(7—20) 378 —231 41372 .4706 
—146.916r 14734.2725 1(8—21) 392 —244. 46135 .2764 
—154 16722 .8852 1(8—21) 406 — 259 52035.7419 
—161.25 18901 .3609 | 1(8—22) 435 — 266. 57467 .5970 
—168.6r 20294 .6356 |} 1(8—22) 450 —281 60364 .8750 


—176.25 22803 . 7383 1(9—23) 465 —296. 67631. 1107 
47 —184 25533 . 8680 1(8—23) 496 —304 73189 .0720 
—191.916r 28497 .1290 1(9—24) 512 —319. 81530 .2345 
49 — 200 30428 .0980 1(9—24) 528 —336 89405 .9520 
50 — 208.25 30449 .5231 1(9—25) 561 —344. 93132 .9528 


—216.6r 33843 .3333 1(9—25) 578 — 361 96082 .0856 

— 225.25 37506 .4392 1(10—26) 595 — 378.25 104845 .616 
53 — 234 39823 .0675 | 1(9—26) 630 — 387 114079 .560 
54 —242.916r 43950 .4347 | 1(10—27) 648 —404.75 124075 .662 
55 — 252 48384 1(10—27) 666 — 423 136438 .061 








*c(a—b) means that observations numbered a to b (inclusive) are to be grouped and 
multiplied by the weight c. 







TABLE OF WEIGHTS—Continued 


$a_2 = \Sigma_+ w_2 y / \Sigma w_2\epsilon^2$


50 —9.64 

44 —13.40909091 
92 —16.65217391 
76 —21.44736842 


370 — 23 .44324324 
268 — 29 .00746269 
570 —33 .46315789 
400 — 40.06 

1,078 — 47. 28571429 


352 —55. 13636364 
1,470 | —61.34285714 

472 —70. 22881356 
1,044 —73.68965517 
1,624 —80. 70689655 


1,350 —90.76 
- : -5) 2,480 —101.4419355 

11(99—11) — 6(0—5) 5,984 —112.7% 
5(10—12) ‘ 5) 3,080 —121.5181818 
11(10—12 ) 3 7,370 — 133 .8597015 


2011—13 ( j 1,452 — 146 .8305785 
13(10—13) 5) 12,428 —151.7531381 
3(11—14 y ) 3,200 —161.74 

13(11—14 8(0—-6 14,924 —175.8780488 
7(12—15 | 8,624 —190.6428571 


13(12—15) § 17 ,628 — 201 .9734513 

7(13—16 . 7 10, 136 —217.7707182 
15(13—16 ( 23 , 140 —234.1972342 
7(14—17 ( 11,760 — 246 .8714286 
3(13—17) ‘ 6,250 — 253 


8(14—18 5(1—% 17 ,680 —270.5941176 
17(14—18 10(0—8) 39,780 — 288 .8153846 
8(15—19) 5(1—8 20,240 — 302 .7086957 
17(15—19 10(0—8) 45,390 —321.9617978 
9(16—20 5(1—9) 25,320 — 341 .8440758 


17(16—20) 10(0—8) 51,340 — 357 .0821192 
3(16—21) 2(1—9 10,800 — 364.54 

19(16—21 12(0—9) 71,858 — 385 .5901639 
5(17—22 3(1—10 19,840 — 407 .2677419 
19(17—22) 12(0—9) 80,522 — 423 . 7239264 


5(18—23) 3(1—10) 22,180 — 446 .4329125 
7(18—23 4(0—10 32,466 — 469 .7710220 
11(19—24 6(1—11) 53 , 284 — 493 .7369942 
7(19—24) 4(0—10 35,994 —511.9404901 
11(19—25 7(i—11) 65,604 —520.8661972 


23 (19—25 14(0—11) 142,968 —546 
12(20—26 7(1—12 2 —571.7602740 
23 (20—26 ) 14(0—11) 157 ,458 —591.1840491 
12(21—27) — 7(1—12) 85,400 —617 .9780328 
25(21—27) — 14(0—12) 184,800 —645.4 












TABLE OF WEIGHTS—Continued 
j = 3
$a_3 = \Sigma_- w_3 y / \Sigma w_3\epsilon^3$        $b_{53} = a_3 + \beta_{53}a_5$

$w_3$*





1(3) 

9(4) 
3(4) 
5(5) 


6(5) 
15(6) 
5(6) 
24(7) 
15(7) 
7(8) 
7(8) 
35 (9) 
20(9 
48 (10) 


27 (9—10) 
63 (10—11) 
5(10—11) 
20 (11—12) 
35(11—12) 


77 (12—13) 
44 (12—13) 
24 (13—14) 
2(13—14) 
117 (14—15) 


54(14—15) 
39 (15—16 ) 
65 (15—16) 
35 (16—17 ) 
25 (16—17) 


5(16—18) 
88 (16—18) 
11(17—19) 
44(17—19) 
64 (18—20) 


34 (18—20) 
17 (19—21) 
39 (19—21 ) 
247 (20—22) 
13 (20—22) 


247 (21—23) 
133 (21—23 ) 
56 (22—24) 
50 (22—24) 
280 (23—25) 


147 (22—25) 
105 (23—26 ) 


— 15. 83333333 
—21 
-— 27 . 16666667 


1140 —30.47368421 
3630 —37 .77272727 
1560 —44.84615385 
9204 —53.51694915 
7140 —61.94117647 


3990 —71.97368421 
5376 —78.75 
32130 — 88 .72222222 
21240 — 100.8644068 
19(2—7) 59736 —112.1946565 


19(2—7) 63612 —117.6881720 
40(2—8) 172620 —129.7846715 

3(2—8) 15540 —144.1351351 
11(2—9) 71280 — 157 .5740741 
23 (2—8) 154560 — 167.25 


48 (3—9) 378840 — 183.7195122 
25 (2—9) 244200 — 198 .7297297 
13(3—10) 147264 — 216 .5677966 

1(2—10) 13716 — 232 .9265092 
56 (8—11) 881244 — 252. 1282528 


29 (2—10) 485460 — 262. 2043011 
20(3—11) 382590 — 282 .5244648 
31(2—11) 701220 —301 .4137931 
16(3—12) 409920 — 323. 1065574 
11(3—12) 316800 —345 .7916667 


3(3—13) 93060 —352.4432624 
51(3—13) 1,768272 — 375 .9644670 
7(3—13) 256410 — 388 .0855856 
27 (3—13) 1, 102464 —412.7298851 
37 (8—14) 1,736928 — 435 .3016360 


19(3—14) 988380 —461.3137255 
9(3—15) 533052 — 485 .2363184 
20(3—15) 1 ,305720 —512.6129032 
123 (4—16) 8 ,810490 — 540 .9827586 
7 (3—15) 524160 ~- 555 .5416667 


129 (4—16) 10 578516 —585 .0301205 
66 (3—16) 6 ,091932 —611.8587896 
27 (4—17) 2,717820 —642.7178952 
23 (3—17) 2,587500 —670.8986667 
141 (4—17) 16 ,009140 — 690 .6405672 


94(4—17) 10,971492 — 703 . 9420655 
64 (4—18) 8 336160 — 733 .0989520 

165 (23—26 ) 98 (4—18) 13 ,809180 —766 .9110070 
44 (24—27) 25(4—19) 3, 907200 —797 .4189189 
92(24—27) 51(4—19) 8 , 595744 — 832 . 5982533 


















26 
27 
28 
29 
30 
31 
32 
33 
34 
35 


36 
37 
38 
39 
40 


41 
42 
43 
44 
45 


46 
47 
48 
49 
50 
51 
52 
53 
54 
55 


TABLE OF WEIGHTS—Continued 


206 (10) 
415 (10) 
284 (11) 
235 (11) 
415(12) 
217 (12) 
288 (13) 
2989 (13) 
16 (14) 
2989 (14) 
944 (15) 


535 (15) 

586 (16) 
1533 (16) 

3 (16—17 ) 
1491 (16—17) 


2240 (17—18) 
9420 (17—18) 

141 (18—19) 
11148 (18—19) 
2820 (19—20) 


13332 (19—20) 

415 (20—21) 
15620 (20—21 ) 
4779 (21—22) 


19074 (21—22) 


5535 (22—23 ) 
21945 (22—23 ) 


415(23—24) — 


8489 (23—24) 

3785 (24—25) 
30485 (24—25) 

8680 (25—26 
34645 (25—26 ) 


1967 (26—27) — 


46255 (26—27 ) 




J 


* 
WwW, 


5(2) 
2(3) 

46 (3) 
10(3—4) 
73 (3—4) 
15(4—5) 
53 (3—5) 
41 (4—6) 
29 (4—6) 


11(4—7) 
31(4—7) 
71(5—8) 
395 (4—7 ) 
131 (5—8) 


245 (5—8) 
161 (6—9) 
117 (5—9) 
194 (6—10) 
98 (6—10) 


115(6—11) 
1155 (6—11) 

6(7—12) 
1344 (6—11) 
410(7—12) 


221 (7—12) 
235 (8—13) 
561(7—13) 
502 (8—14) 
957 (8—14) 


1405 (9—15) 
5397 (8—15) 

79 (9—16) 
6045 (9—16) 
1765 (9—16) 


8151 (9—16) 
245 (10—17) 
9031 (10—17) 
2576 (10—18) 
9955 (10—18) 


2834 (11—19) 
10923 (11—19) 

194 (11—20) 
4667 (11—19) 
1693 (12—21) 


15249 (11—20) 
4263 (12—21) 
16549 (12—21) 
924 (13—22) 
20515 (12—22) 




$b_4 = a_4 = \Sigma w_4 y / \Sigma w_4 \epsilon^4$


2(0—1) 

1(1) 
14(0—1 
11(1 


50(0—1 

14(1 
58 (0—1 
32(1—2 
28 (0O—1 


12(1—2 
26 (0O—2 
70(1—2 
396 (0O—2 
106 (1—3) 


226 (0—2 
120(1—3 
100 (0—3 ) 
185(1—3) 
78 (0—3 ) 


134 (1—3) 
1126 (0—3 

5(1—4 
1450 (0—3 
379 (1—4 


226 (0—3 
206 (1—4 
532 (0—4 
497 (1—-4) 
826 (0O—4 


1071 (1—3) 
5408 (0—4 

70(1—5) 
5792 (0O—4 
1696 (1—5) 


7008 (0—5) 
226 (1—5 
7456 (0—5) 
2271 (1—6 
9354 (0—5) 


2406 (1—6) 
9894 (0O—5) 
185 (1—6 
3850 (0-—6 ) 
1560 (1—6 
14080 (0O—6 
3610 (1—7 
14800 (0—6 ) 
758 (1—7) 
17754 (0-7 ) 


5376 
3600 
39648 
12480 
84768 
90000 
89880 
54960 
200376 
613152 
4, 138824 
1,721280 


4, 182864 
3, 345552 
3,395784 
7 ,072128 
4, 241328 


6 855936 
80 ,879904 
489312 
125, 116488 
44, 279712 
28 , 38477 
34,551408 
103 , 700520 
97 , 162464 
209 ,629728 


346 ,514448 
1680 ,885360 
27 ,560016 
2408 , 213808 
763 , 126848 


3922 ,631856 
133 ,851648 
5452 ,591056 
1855 , 459872 
3071 ,065288 


2521, 801584 
10844 , 513064 
226, 96 
5386 ,084704 
2388 , 695712 


22841 ,591136 
6940 ,591392 
29764 , 941336 
1798 , 408080 
46002 , 560208 







TABLE OF WEIGHTS—Continued 


1(3) 
15(4) 
9(4) 
49 (5) 


26 (5) 
72 (6) 
3(6) 
112(7) 
275(7) 
19(8) 
481(8) 
1755 (9) 
287 (9) 
819(10) 


112(10) 
697 (11) 
1037 (11) 
5586 (12) 
510(12) 


228 (13 
1824 (13) 
2528 (14) 
3220 (14) 

35200 (15) 


, 22149(15) 
14553 (16 ) 
29475 (16) 
34125 (17) 
10465 (17) 


3540 (18) 
1925 (18) 
36736 (19) 
110374 (19) 
12111 (20) 


69223 (20) 
524160 (21) 
17949 (21) 
18954 (22) 
6075 (22) 


104091 (23) 
8060 (23 ) 
128 (24) 
73437 (24) 
34595 (25) 


9709 (25) 

57967 (25—26 ) 
13650 (25—26 ) 
289960 (26—27 ) 
411312 (26—27 ) 


$j = 5$


$b_5 = a_5 = \Sigma w_5 y / \Sigma w_5 \epsilon^5$


= 
Us 


4(2) 
49 (3) 
26 (3) 

111(4) 


55 (4) 

143 (5) 
4(4—5) 
130 (5—6) 
301 (5—6) 


20 (6—7 ) 
464 (6—7 ) 
1360 (6—8) 
213 (6—8) 
589(7—9) 


75(7—9) 
455 (8—10) 
583 (7—10) 

3059 (8—11) 
430 (8—10) 


115(9—12) 
1274(8—11) 
1701 (9—12) 
2030 (9—12) 

21518(10—13) 


13230 (10—13) 
8463 (11—14) 
16344 (11—14) 
16632 (11—15) 
4998 (11—15) 


1652 (12—16) 
861 (12—16) 
14874 (12—17) 
43890 (12—17 ) 
4719 (13—18) 


25960 (13—18) 
241736 (13—18) 
8085 (13—18) 
8385 (14—19) 
2568 (14—19) 


43290 (15—20) 
3289 (15—20) 
47 (15—21) 
26468 (15—21) 
12285 (16—22) 


3400 (16—22) 
36800 (17—23 ) 
8075 (16—23) 
168883 (17—24 ) 
236698 (17—24 ) 





5(1) 
35(1—2) 
14(1—2) 
84(1—2) 


30(1—2) 
55(1—3) 
3(1—3) 
143 (2—3) 
231 (1—3) 


13 (2—4) 
364 (1—3) 
1547 (2—4) 
189 (1—4) 
456 (2—5) 


68 (1—4) 


( 
18183 (2—7 


10235 (2—7 ) 
5735(2—8) 
12800 (2—7 ) 
15125 (2—8) 
4199 (2—8) 


1239 (2—9 ) 
732 (2—8) 
14245 (2—9) 
39121 (2—9) 
3809 (2—10) 


23405 (2—9 ) 
229395 (3—10) 
6944 (2—10) 
6794 (3—11) 
2233 (2—10) 


35445 (3—11) 
2461 (2—11) 
47 (3—11) 
24192 (2—11) 
10619 (3—12) 


2793 (3—12) 
30355 (3—12) 
7514 (3—12) 
+ 144768 (3—13) 
+ 193397 (3—13) 




240 

6720 
6720 
65520 
51840 
208560 
15120 
840840 
2, 808960 
252720 
8,910720 
47 ,895120 
9 ,954000 


35, 112000 


6 , 283200 
47 , 295360 
95 ,729040 

615, 593160 
92 ,085120 


37 ,044000 
485 , 503200 
788 , 492880 

1202 ,742240 
15207 , 265920 


10916 ,650080 
8211 ,661920 
19440 ,662400 
27440 ,028000 
9460 , 956960 


3608 , 463600 
2257 , 995600 
51234, 261120 
171065 ,664000 
20909 , 168280 


135787 , 454880 
1, 379383 , 716480 
52229 , 469600 
60473 ,424960 
21840 , 477120 


408520 , 153080 
34633 , 959360 
639 , 576000 
400419 , 714240 
204535 , 553040 


62103 , 392400 
696524 ,337600 
185253 , 868800 

4 , 267843 ,419840 
6, 521985, 180480 







MODEL FORM 1 


605.48 687. 
561.83 725. 
516.42 761. 
469.22 795. 
420.29 827. 
369 .60 857. 
317.17 885. 
2928 .93 263 .06 911. 


207 .40 935.95 

4465 .36 149.98 957 .86 
91.05 977.79 

6317 .05 30.52 995.47 
4002 .02 10319 .07 


of Wh - © 


D1 mS 


MODEL FORM 2* 


2(0—12)/265 a 598.735200 
+ 


Ba — 52 Baa: +48.492011 
Bao + 2103 .90448 _ Bots 0.029212 


647 256423 





2{11(10—12) — 6(0—5)}/7370 —0.9325387 
E Bu2 — 188.8597018 —0.0018586 


—0.9343973 





2{217(12— ) — 98(6—10) + 78(0—$)}/4,241328 
+ 





0.0000138848 


21(6—12)/136 40. 448824 
ia Bau — 88 +0.385928 
Ba + 


40834752 


2{35(11—12) — 23(2—8)}/154560 3 —0.004385546 
e Bss — 


—0.004385546 


Pe = oe 








as = bs 


ee ee 


*c(a—b) means that observations numbered a to b (inclusive) are to be grouped and
multiplied by the weight c.


If a calculating machine is not available, it is best to work out the sums
(a—b) individually. The product w(a—b) should be multiplied out in full, but
seven-figure logarithms may be used for the division by $\Sigma w_j \epsilon^j$
and for the calculation of the terms $\beta_{jk} a_k$.

The values obtained for the polynomial coefficients by the method of least
squares and by the method of grouping are shown below, together with the
standard errors.


                      Least Squares        Grouping

$b_0 \times 10^3$     647254.3 ± 12        647256.4
$b_1 \times 10^3$      40834.6 ± 2.2        40834.8
$b_2 \times 10^4$      −9343.5 ± 5          −9344.0
$b_3 \times 10^5$       −437.6 ± 2.2         −438.6
$b_4 \times 10^6$         13.82 ± 3.6          13.9


When the polynomial coefficients have been determined, the fitted values 
may be worked out. If the polynomial is required in terms of a variable other 
than $\epsilon$, a Horner shift is performed in the usual way.
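The change of variable mentioned above can be sketched in Python. This is a minimal illustration, assuming the new variable $t$ is related to the old one by $\epsilon = t + h$ for a known shift $h$; the function name `horner_shift` is hypothetical and does not come from the paper.

```python
def horner_shift(coeffs, h):
    """Return the coefficients of q(t) = p(t + h), lowest degree first.

    `coeffs` holds b_0, b_1, ..., b_p of p(eps) = sum b_j * eps**j.
    The shift is done by repeated synthetic division (Horner's scheme).
    """
    c = list(coeffs)
    n = len(c)
    for i in range(n - 1):
        # One pass of synthetic division; after pass i, c[i] is final.
        for j in range(n - 2, i - 1, -1):
            c[j] += h * c[j + 1]
    return c
```

For example, shifting $1 + 2\epsilon + 3\epsilon^2$ by $h = 2$ gives $17 + 14t + 3t^2$.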

If the standard errors are required, the residuals $v$ must be calculated.
Assuming an efficiency of 90%, the estimated standard error of an observation is
given by


$s_v = [\Sigma v^2/\{n - 0.9(p + 1)\}]^{1/2}$.


The estimated errors of the polynomial coefficients and fitted values can be 
found by using the tabulated weight functions [7] for the least squares solution, 
multiplied by the factor 1.05 to allow for the efficiency of 90%. 
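A minimal Python sketch of this calculation follows. The function names are hypothetical. The observation that $1/\sqrt{0.9} \approx 1.054$ is offered only as a plausible reading of where the factor 1.05 comes from; the text does not state its origin.

```python
import math

def grouping_standard_error(residuals, degree, efficiency=0.9):
    """Estimated standard error of one observation, s_v.

    Implements s_v = [sum(v^2) / {n - efficiency * (p + 1)}]^(1/2),
    where p is the degree of the fitted polynomial.
    """
    n = len(residuals)
    dof = n - efficiency * (degree + 1)
    if dof <= 0:
        raise ValueError("too few observations for this degree")
    return math.sqrt(sum(v * v for v in residuals) / dof)

def error_inflation_factor(efficiency=0.9):
    """Assumed origin of the factor 1.05: 1 / sqrt(efficiency)."""
    return 1.0 / math.sqrt(efficiency)
```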

It is sometimes necessary to know whether the neglect of higher powers is
justified. The quantities $a_j^2 \Sigma T_j^2$ provide a test for determining the
degree of the polynomial to be used, since $a_j^2 \Sigma T_j^2$ is the amount by
which $\Sigma v^2$ is reduced when the degree is increased from $j - 1$ to $j$ in
the least squares method. To a sufficiently good approximation we can use the
grouping value of $a_j$ for the least squares value and put
$\Sigma T_j^2 = n^{2j+1}/\kappa_j$, where


$\kappa_1 = 12$, $\kappa_2 = 180$, $\kappa_3 = 2800$, $\kappa_4 = 44,000$, $\kappa_5 = 700,000$.
In the example used here we find that


$a_3^2 \Sigma T_3^2 = 41$; $a_4^2 \Sigma T_4^2 = 0.016$; $a_5^2 \Sigma T_5^2 = 0.009$.


Thus $a_3$ is highly significant, $a_4$ is of doubtful significance, while terms of
higher degree are probably insignificant.
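The approximation $\Sigma T_j^2 = n^{2j+1}/\kappa_j$ can be sketched in Python as follows. The function names are hypothetical. The exact expression used for comparison at $j = 1$ assumes that $T_1$ is the deviation from the mean for $n$ equally spaced points at unit spacing, which is an assumption made here and not a statement from the text.

```python
# Constants kappa_j as tabulated in the text.
KAPPA = {1: 12, 2: 180, 3: 2800, 4: 44000, 5: 700000}

def approx_sum_T_sq(n, j):
    """Approximate sum of T_j^2 over n points: n^(2j+1) / kappa_j."""
    return n ** (2 * j + 1) / KAPPA[j]

def exact_sum_T1_sq(n):
    """Exact sum of squared deviations from the mean, unit spacing."""
    return n * (n * n - 1) / 12

def reduction_in_sum_sq(a_j, n, j):
    """Approximate drop in sum(v^2) when the degree goes from j-1 to j."""
    return a_j ** 2 * approx_sum_T_sq(n, j)
```

For $n = 13$ and $j = 1$ the approximation differs from the exact sum by well under one per cent.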


REFERENCES 


[1] R. A. FISHER AND F. YATES, Statistical Tables for Biological, Agricultural, and Medical
Research, 3rd ed., Oliver and Boyd, Ltd., Edinburgh, 1948.

[2] R. L. ANDERSON AND E. E. HOUSEMAN, "Tables of orthogonal polynomial values
extended to n = 104," Research Bulletin 297 (1942), Iowa State College Agricultural
Experimental Station.

[3] R. T. BIRGE, "Least-squares fitting of data by means of polynomials," Rev. Modern
Physics, Vol. 19 (1947), pp. 298-360.

[4] HAROLD JEFFREYS, Theory of Probability, 2nd ed., Clarendon Press, Oxford, 1948, p. 179.

[5] P. G. GUEST, "Orthogonal polynomials in the least-squares fitting of observations,"
Philos. Mag. (7), Vol. 41 (1950), pp. 124-137.

[6] R. T. BIRGE AND J. D. SHEA, "A rapid method for calculating the least squares solution
of a polynomial of any degree," Univ. California Publ. Math., Vol. 2 (1927), pp.
67-118.

[7] P. G. GUEST, "Estimation of the error at a point on a least-squares curve," Australian
Jour. Sci. Research. Ser. A, Vol. 3 (1950), pp. 173-182; "Estimation of the errors
of the least-squares polynomial coefficients," ibid., pp. 364-375.





A MULTIVARIATE GAMMA-TYPE DISTRIBUTION¹


By A. S. KRISHNAMOORTHY AND M. PARTHASARATHY


Madras Christian College, Tambaram, India, and 
Ramanujan Institute of Mathematics, Madras, India 


Introduction. Mehler has shown that the two-variate probability density
function (pdf) for correlated variates, each of which has a marginal Gaussian
distribution, can be expressed as a series bilinear in Hermite polynomials:

$\frac{1}{2\pi\sqrt{1-\rho^2}} \exp\{-\tfrac{1}{2}(x^2 - 2\rho xy + y^2)/(1-\rho^2)\}$

$= \frac{1}{2\pi} \exp\{-\tfrac{1}{2}(x^2 + y^2)\} \left[1 + \rho H_1(x)H_1(y) + \frac{\rho^2}{2!} H_2(x)H_2(y) + \cdots\right]$.


Kibble [5] has extended this result to any number of variables and noticed a 
small difference between the generalization and the particular case due to Mehler. 
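Mehler's series can be checked numerically with a short Python sketch. It assumes that $H_r$ denotes the Hermite polynomial orthogonal with respect to $e^{-x^2/2}$ (so that $H_1(x) = x$ and $H_{r+1}(x) = xH_r(x) - rH_{r-1}(x)$); the function names are hypothetical.

```python
import math

def hermite_values(x, n_max):
    """Values H_0(x), ..., H_{n_max}(x) via H_{n+1} = x*H_n - n*H_{n-1}."""
    h = [1.0, x]
    for n in range(1, n_max):
        h.append(x * h[n] - n * h[n - 1])
    return h[: n_max + 1]

def mehler_series(x, y, rho, terms=60):
    """Truncated bilinear series for the bivariate normal pdf."""
    hx = hermite_values(x, terms)
    hy = hermite_values(y, terms)
    total = 0.0
    coeff = 1.0  # holds rho**n / n!
    for n in range(terms + 1):
        total += coeff * hx[n] * hy[n]
        coeff *= rho / (n + 1)
    return math.exp(-0.5 * (x * x + y * y)) / (2 * math.pi) * total

def bivariate_normal_pdf(x, y, rho):
    """Closed form on the left-hand side of Mehler's identity."""
    q = (x * x - 2 * rho * x * y + y * y) / (1 - rho * rho)
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(1 - rho * rho))
```

For moderate $|\rho|$ the truncated series and the closed form agree to many decimal places.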

It is known that Mehler’s series is not an isolated result, there being a similar 
series bilinear in Laguerre polynomials, discussed by Hardy [3], Watson [6], 
and Kibble [4], and series bilinear in certain other polynomials, discussed
by Campbell [2], and by Aitken and Gonin [1]. All these results can be generalized 
for any number of variables in much the same way as Kibble has generalized 
Mehler’s result. These generalizations are contained in Krishnamoorthy’s 
thesis ‘‘Multivariate Distribution Functions” (in the library of the University 
of Madras). In the present paper the generalization involving Laguerre poly- 
nomials is given. 


1. Notation and summary. It was shown by Kibble [4] that a two-variate
distribution function, in which each of the variates $x_i$, $i = 1, 2$, has the
frequency function

(1.1)  $\phi(x_i) = \frac{x_i^{p-1} e^{-x_i}}{\Gamma(p)}$,

may be represented by

$\phi(x_1)\phi(x_2)\left[1 + \frac{\rho^2}{p} L_1(x_1, p)L_1(x_2, p) + \frac{\rho^4}{2!\,p(p+1)} L_2(x_1, p)L_2(x_2, p) + \cdots\right]$,

where $L_r(x, p)$, $p > 0$, is the generalized Laguerre polynomial of degree $r$
satisfying

i 
sc —<) t2"ol2)] 
1.2 Lz, p) = riLe”” (x) = ~———————__.. 
( ) 7 Pp ° o(x) 
1 Sections 1 to 4 of this paper, deriving the distributions, were written by the first author; 
Section 5, on the convergence of certain series, was contributed by the second author. 


It is the object of this paper to extend Kibble’s result to n variables, assuming 
(i) that the variates have each a marginal Gamma-type distribution given by 
(1.1) with the same parameter p; (ii) that the variates have Gamma-type dis- 
tributions with different parameters. The extension in case (i) appears in (3.7) 
and that in case (ii) appears in (4.1). The convergence of series obtained in 
either case is established in Section 5. 

An outline of the procedure followed may be given thus. We obtain, in (2.2), 
the moment-generating function (mgf) for the joint distribution of
$\xi_i = \tfrac{1}{2}x_i^2$ $(i = 1, 2, \cdots, n)$, where each $x_i$ has a normal distribution with zero
mean. From this we get the mgf for the distribution of the sums of squares in 
a sample of m from a normal correlated n-variate distribution, and thence, in 
(2.3), a possible mgf for an n-variate distribution in which each variate has a 
Gamma-type distribution. Finally we obtain from (2.3) the n-variate distribu- 
tion in (3.7) by a process which is essentially the inversion of the Laplace trans- 
form, 


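(1.3)  $(1-\alpha)^{-p}\{\alpha/(1-\alpha)\}^r = \int_0^\infty e^{\alpha x} f_r(x, p)\,dx$,

where

$f_r(x, p) = \phi(x)\,L_r(x, p)/p^{(r)}$,  $p^{(r)} = p(p+1)\cdots(p+r-1)$.

It will be noticed that (1.3) is true for $r = 0$ if we define (as we may)
$L_0(x, p) = 1$, $p^{(0)} = 1$.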


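2. An mgf for an n-variate Gamma-type distribution. Let $\|\rho_{ij}\|$, defined by

$\|\rho_{ij}\| = \begin{Vmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & \rho_{22} & \cdots & \rho_{2n} \\ \vdots & \vdots & & \vdots \\ \rho_{n1} & \rho_{n2} & \cdots & \rho_{nn} \end{Vmatrix}$,

where $\rho_{ij} = \rho_{ji}$, be a positive definite matrix. Then the normal correlated
n-variate distribution having zero means and $\|\rho_{ij}\|$ for its variance-covariance
matrix is given by

(2.1)  $\frac{|\rho^{ij}|^{1/2}}{(2\pi)^{n/2}} \exp\left\{-\tfrac{1}{2}\sum_{i,j=1}^{n} \rho^{ij} x_i x_j\right\} dx_1\,dx_2 \cdots dx_n$

in the usual notation, where $\|\rho^{ij}\|$ is the inverse of $\|\rho_{ij}\|$. Denoting the mgf
for a distribution of $\xi_i$, $i = 1, 2, \cdots, n$, having any pdf, by

$G(\alpha_1, \alpha_2, \cdots, \alpha_n) = E\{\exp \sum_i \alpha_i \xi_i\}$,

we find that, in the case $\xi_i = \tfrac{1}{2}x_i^2$ where the $x_i$ have the joint distribution (2.1),
the mgf is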







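$G_{1/2}(\alpha_1, \alpha_2, \cdots, \alpha_n) = E\{\exp \tfrac{1}{2}\sum_i \alpha_i x_i^2\}$

$= \frac{|\rho^{ij}|^{1/2}}{(2\pi)^{n/2}} \int \exp\left\{-\tfrac{1}{2}\sum_{i,j} (\rho^{ij} - \delta_{ij}\alpha_i) x_i x_j\right\} dx$,

where $-\infty < x_i < \infty$, $i = 1, 2, \cdots, n$, $\int \cdots dx$ denotes an integration with
respect to all $x_i$, and $\delta_{ij}$ is the Kronecker delta defined as zero if $i \ne j$ and
unity otherwise. Therefore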

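(2.2)  $G_{1/2}(\alpha_1, \alpha_2, \cdots, \alpha_n) = |\rho^{ij}|^{1/2}\,|\rho^{ij} - \delta_{ij}\alpha_i|^{-1/2} = |\delta_{ij} - \rho_{ij}\alpha_j|^{-1/2}$,

provided that $\|\rho^{ij} - \delta_{ij}\alpha_i\|$ is positive definite. It follows that the mgf for the
distribution of the sums of squares in a sample of $m$ from a normal correlated
n-variate distribution is obtained by raising the expression in (2.2) to the $m$th
power, and furthermore, that the replacement of $m/2$ by $p$ leads to a possible
mgf for an n-variate distribution in which each variate $x_i$ has the frequency
function (1.1). Therefore a possible mgf for an n-variate Gamma-type distribution
defined as above is obtained from (2.2) when $-\tfrac{1}{2}$ in the power on the right side of
(2.2) is changed to $-p$. The expression for the mgf is

(2.3)  $G_p(\alpha_1, \alpha_2, \cdots, \alpha_n) = |\delta_{ij} - \rho_{ij}\alpha_j|^{-p}$

$= \begin{vmatrix} 1-\alpha_1 & -\rho_{12}\alpha_2 & \cdots & -\rho_{1n}\alpha_n \\ -\rho_{12}\alpha_1 & 1-\alpha_2 & \cdots & -\rho_{2n}\alpha_n \\ \vdots & \vdots & & \vdots \\ -\rho_{1n}\alpha_1 & -\rho_{2n}\alpha_2 & \cdots & 1-\alpha_n \end{vmatrix}^{-p}$

$= \{(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)\}^{-p} \begin{vmatrix} 1 & -\rho_{12}\beta_2 & \cdots & -\rho_{1n}\beta_n \\ -\rho_{12}\beta_1 & 1 & \cdots & -\rho_{2n}\beta_n \\ \vdots & \vdots & & \vdots \\ -\rho_{1n}\beta_1 & -\rho_{2n}\beta_2 & \cdots & 1 \end{vmatrix}^{-p}$,

where $\beta_i = \alpha_i/(1-\alpha_i)$, $i = 1, 2, \cdots, n$. It is convenient to write (2.3) in the
form

(2.4)  $G_p(\alpha_1, \alpha_2, \cdots, \alpha_n) = \{(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)\}^{-p}\{g(\beta_1, \beta_2, \cdots, \beta_n)\}^{-p}$,

where $g(\beta_1, \beta_2, \cdots, \beta_n)$ is the determinant of $\beta$'s in (2.3).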
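The passage from the determinant in the $\alpha$'s to the factored form with the $\beta$'s can be checked numerically. The sketch below is a hypothetical illustration in plain Python, assuming unit diagonal elements $\rho_{ii} = 1$ (as the diagonal entries $1 - \alpha_i$ in (2.3) require); the function names are not from the paper.

```python
def det(m):
    """Determinant by cofactor expansion (adequate for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total

def mgf_direct(rho, alpha, p):
    """|delta_ij - rho_ij * alpha_j|^(-p), the first form in (2.3)."""
    n = len(alpha)
    m = [[(1.0 if i == j else 0.0) - rho[i][j] * alpha[j] for j in range(n)]
         for i in range(n)]
    return det(m) ** (-p)

def mgf_factored(rho, alpha, p):
    """prod(1 - alpha_i)^(-p) * g(beta)^(-p), the form (2.4)."""
    n = len(alpha)
    beta = [a / (1.0 - a) for a in alpha]
    g = [[1.0 if i == j else -rho[i][j] * beta[j] for j in range(n)]
         for i in range(n)]
    prod = 1.0
    for a in alpha:
        prod *= 1.0 - a
    return (prod * det(g)) ** (-p)
```

The two functions agree because dividing column $j$ of the first determinant by $1 - \alpha_j$ produces the second.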


3. A series for an n-variate Gamma-type distribution. Expanding the $g$ in
(2.4) by Maclaurin's theorem for a function of $n$ variables, we get

(3.1)  $g(\beta_1, \beta_2, \cdots, \beta_n) = g_0 + \left(\sum_i \beta_i \frac{\partial}{\partial\beta_i}\right) g_0 + \frac{1}{2!}\left(\sum_i \beta_i \frac{\partial}{\partial\beta_i}\right)^{[2]} g_0 + \cdots + \frac{1}{n!}\left(\sum_i \beta_i \frac{\partial}{\partial\beta_i}\right)^{[n]} g_0$,

where $g_0 = g(0, 0, \cdots, 0)$ and $(\sum_i \beta_i(\partial/\partial\beta_i))^{[r]} g_0$ is the result of first expanding
$(\sum_i \beta_i(\partial/\partial\beta_i))^r$ regarding the operators $\partial/\partial\beta_i$ as algebraical numbers, then giving the
operators their proper roles in the expanded form of $(\sum_i \beta_i(\partial/\partial\beta_i))^r$, and finally
putting $\beta_1 = \beta_2 = \cdots = \beta_n = 0$ in the partial derivatives of $g$ which we get
when the expanded form is applied to $g$.

Clearly the expansion of $g(\beta_1, \beta_2, \cdots, \beta_n)$ in (3.1) does not contain terms
linear in the $\beta$'s or terms such as $\beta_1^{p_1} \cdots \beta_k^{p_k} \cdots$ with any $p_i > 1$. In fact


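$\frac{\partial g_0}{\partial\beta_1} = \begin{vmatrix} 0 & 0 & \cdots & 0 \\ -\rho_{12} & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ -\rho_{1n} & 0 & \cdots & 1 \end{vmatrix} = 0$,

where, of course, the partial derivation is performed before we put $\beta_1 = \beta_2 =
\cdots = \beta_n = 0$; and similarly $\partial g_0/\partial\beta_2$, $\partial g_0/\partial\beta_3$, $\cdots$, $\partial g_0/\partial\beta_n$ are all zero, so that
$(\sum_{i=1}^{n} \beta_i\,\partial/\partial\beta_i) g_0 = 0$.
Further

$\frac{\partial^2 g_0}{\partial\beta_1\,\partial\beta_2} = -C_{12}$,  $\frac{\partial^3 g_0}{\partial\beta_1\,\partial\beta_2\,\partial\beta_3} = -C_{123}$,  $\cdots$,  $\frac{\partial^n g_0}{\partial\beta_1\,\partial\beta_2 \cdots \partial\beta_n} = -C_{12\cdots n}$.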



Hence (3.1) can be written

(3.2)  $g(\beta_1, \beta_2, \cdots, \beta_n) = 1 - \left(\sum_{i<j} C_{ij}\beta_i\beta_j + \sum_{i<j<k} C_{ijk}\beta_i\beta_j\beta_k + \cdots + C_{12\cdots n}\beta_1\beta_2\cdots\beta_n\right) = 1 - B$, say.
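The coefficients in (3.2) can be computed directly from the $\rho$'s. The sketch below rests on an identity derived here from the determinant in (2.3), not stated in the text: because $g$ is linear in each $\beta_i$, the coefficient of $\prod_{i \in S}\beta_i$ is the determinant of the sub-matrix on $S$ with zero diagonal and off-diagonal entries $-\rho_{ij}$, so that $C_S = (-1)^{|S|+1}$ times the determinant of the corresponding matrix with entries $+\rho_{ij}$. The function names are hypothetical, and subsets are indexed from 0.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion (adequate for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total

def c_coefficients(rho):
    """Map each index subset S (|S| >= 2) to C_S as defined by g = 1 - B."""
    n = len(rho)
    coeffs = {}
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            hollow = [[0.0 if i == j else rho[i][j] for j in subset]
                      for i in subset]
            coeffs[subset] = (-1) ** (size + 1) * det(hollow)
    return coeffs

def sigma(coeffs):
    """Sum of |C_S| over all subsets."""
    return sum(abs(c) for c in coeffs.values())
```

With $\rho_{12} = 0.5$, $\rho_{13} = 0.2$, $\rho_{23} = -0.3$ this gives $C_{12} = \rho_{12}^2$ and $C_{123} = 2\rho_{12}\rho_{13}\rho_{23}$, and the sum of absolute values is 0.44.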
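Using (3.2) in (2.4) and expanding $(1 - B)^{-p}$ formally by the binomial theorem,
we get

(3.3)  $G_p(\alpha_1, \alpha_2, \cdots, \alpha_n) = \{(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)\}^{-p}\left[1 + \sum_{r=1}^{\infty} \frac{p(p+1)\cdots(p+r-1)}{r!} B^r\right] = \sum_{r=0}^{\infty} \frac{p(p+1)\cdots(p+r-1)}{r!} B_p^r$,

where

$B_p^r = \{(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_n)\}^{-p} B^r$.

Expanding $B^r$ by the multinomial theorem, we can express $B_p^r$ as a finite sum
of the form

(3.4)  $B_p^r = \prod_{i=1}^{n}(1-\alpha_i)^{-p} \sum K \beta_1^{a_1}\beta_2^{a_2}\cdots\beta_n^{a_n} = \sum K \prod_{i=1}^{n}\left\{(1-\alpha_i)^{-p}\left(\frac{\alpha_i}{1-\alpha_i}\right)^{a_i}\right\}$,

where $K$ is a polynomial in $C_{ij}, \cdots, C_{12\cdots n}$ and $a_1, \cdots, a_n$ are nonnegative
integers of which not more than $n - 2$ are zero.

It is now plain from (3.4) and (1.3) that $B_p^r$ can be expressed as a Laplace
transform: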


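$B_p^r = \int_{0 \le x_i \le \infty} e^{\sum_i \alpha_i x_i} \sum K \prod_{i=1}^{n} f_{a_i}(x_i, p)\,dx_i$,

of which the determining function is

(3.5)  $\sum K \prod_{i=1}^{n} f_{a_i}(x_i, p) = \phi(x_1)\phi(x_2)\cdots\phi(x_n) \sum K \prod_{i=1}^{n} \frac{L_{a_i}(x_i, p)}{p^{(a_i)}}$

(3.6)  $= \phi(x_1)\phi(x_2)\cdots\phi(x_n)\left\{\sum_{i<j} C_{ij}\frac{L(x_i, p)}{p}\frac{L(x_j, p)}{p} + \cdots + C_{12\cdots n}\frac{L(x_1, p)}{p}\cdots\frac{L(x_n, p)}{p}\right\}^{(r)}$,

where $\{\cdots\}^{(r)}$ is a symbol for the $r$th power of a multinomial, in expanding which
we suppose that

$\left\{\frac{L(x, p)}{p}\right\}^{m}\left\{\frac{L(x, p)}{p}\right\}^{n} = \left\{\frac{L(x, p)}{p}\right\}^{m+n}$

for all positive integers $m$, $n$, and after expanding which we set

$\left\{\frac{L(x, p)}{p}\right\}^{m} = \frac{L_m(x, p)}{p^{(m)}}$.

Finally we can replace $B_p^r$ in the series of (3.3) by its determining function in
(3.6) and obtain the form

(3.7)  $\phi(x_1)\phi(x_2)\cdots\phi(x_n) \sum_{r=0}^{\infty} \frac{p^{(r)}}{r!}\left\{\sum_{i<j} C_{ij}\frac{L(x_i, p)}{p}\frac{L(x_j, p)}{p} + \cdots + C_{12\cdots n}\frac{L(x_1, p)}{p}\cdots\frac{L(x_n, p)}{p}\right\}^{(r)}$,

where $\phi(x_i)$ is defined by (1.1), for a distribution function having
$G_p(\alpha_1, \alpha_2, \cdots, \alpha_n)$ defined as in (2.3) for its mgf. The convergence of series (3.7)
is proved, with a certain restriction on the $\rho$'s, in Section 5. Consequently, with
this restriction as regards convergence, we can take (3.7) to be an n-variate
distribution function in which each variate $x_i$, $i = 1, 2, \cdots, n$ has the distribution
function $\phi(x_i)$ in (1.1).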

Remark on the series (3.7). If there are only two $\beta$'s present in any term of
(3.4), this being their least number possible, they will be raised to the same
degree $r$, and therefore the corresponding term of (3.5) will have Laguerre
polynomials of the same degree $r$. If, however, more than two $\beta$'s are present
in a term of (3.4), their degrees may be different and consequently also the de- 
grees of the Laguerre polynomials in the corresponding term of (3.5). Hence 
the n-variate Gamma-type distribution symbolically denoted by (3.7) has the 
property that (i) any term in its expansion involving two variables contains 
Laguerre polynomials of the same degree in those variables, while (ii) a term 
involving more than two variables may contain Laguerre polynomials of dif- 
ferent degrees in the variables. It is known [5] that an analogous property is 
possessed by the extension to n variates of Mehler’s series in Hermite poly- 
nomials. 




4. A generalization of Section 3. If we take instead of the mgf in (2.4) the
more general mgf

$(1-\alpha_1)^{-p_1}(1-\alpha_2)^{-p_2}\cdots(1-\alpha_n)^{-p_n}\{g(\beta_1, \beta_2, \cdots, \beta_n)\}^{-p}$

and repeat the reasoning of Section 3, we shall obtain, in the symbolic notation
of (3.7), the following series (whose convergence is established in Section 5 under
the condition on the $\rho$'s already referred to):

(4.1)  $\phi(x_1)\phi(x_2)\cdots\phi(x_n) \sum_{r=0}^{\infty} \frac{p^{(r)}}{r!}\left\{\sum_{i<j} C_{ij}\frac{L(x_i, p_i)}{p_i}\frac{L(x_j, p_j)}{p_j} + \cdots + C_{12\cdots n}\frac{L(x_1, p_1)}{p_1}\cdots\frac{L(x_n, p_n)}{p_n}\right\}^{(r)}$,

where

(4.2)  $\phi(x_i) = \frac{x_i^{p_i-1}e^{-x_i}}{\Gamma(p_i)}$,  $i = 1, 2, \cdots, n$.

This series, under the condition which secures its convergence, may be regarded as
an n-variate distribution function in which each variate $x_i$, $i = 1, 2, \cdots, n$ has
the distribution function $\phi(x_i)$ of (4.2).²


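5. Addendum: the convergence of the series in (3.7) and (4.1). The object
of this addendum is to establish, under a suitable condition, the convergence
of the series in (3.7) and (4.1). The proof of the convergence depends on the
following lemma.

LEMMA. In the symbolic notation of (3.6), for $r \ge 1$,

$\left|\left\{\frac{L(x, p)}{p}\right\}^{(r)}\right| = \left|\frac{L_r(x, p)}{p^{(r)}}\right| < K(x, p)\,r^{\frac{1}{2}(\frac{1}{2}-p)}$ for $0 < p < \tfrac{1}{2}$, and $< K(x, p)$ for $p \ge \tfrac{1}{2}$,

where $K(x, p)$ is a constant depending on $x$ and $p$.

PROOF. From the well known result $\Gamma(x + a)/\Gamma(x) \sim x^a$ as $x \to \infty$, where
$a$ is a constant, we get

(5.1)  $\frac{p^{(r)}}{r!} = \frac{\Gamma(p+r)}{\Gamma(p)\Gamma(r+1)} \sim \frac{r^{p-1}}{\Gamma(p)}$,  $r \to \infty$.

From a formula of Fejér [7], Hille [8] has deduced that

(5.2)  $\frac{L_r(x, p)}{r!} = \pi^{-1/2} e^{x/2} x^{-\frac{1}{2}(p-\frac{1}{2})} r^{\frac{1}{2}(p-\frac{3}{2})} \cos\left[2\sqrt{rx} - \tfrac{1}{2}\pi(p - \tfrac{1}{2})\right] + O\left[r^{\frac{1}{2}(p-\frac{5}{2})}\right]$,  $r \to \infty$.

Combining (5.2) with (5.1), we conclude that

(5.3)  $\left|\frac{L_r(x, p)}{p^{(r)}}\right| = \left|\frac{L_r(x, p)}{r!}\right| \Big/ \frac{p^{(r)}}{r!} < A(x, p)\,r^{\frac{1}{2}(\frac{1}{2}-p)}$,  $r > r_0$,

where $A(x, p)$ is a constant which depends on $x$ and $p$. Further, once $r_0$ is fixed,

(5.4)  $\left|\frac{L_r(x, p)}{p^{(r)}}\right| < B(x, p)$,  $r \le r_0$,

where $B(x, p)$ is also a constant which depends on $x$ and $p$. Equations (5.3)
and (5.4) together yield the result stated in the lemma where $K = \max(A, B)$.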
THEOREM. The series in (3.7),

$\sum_{r} \frac{p^{(r)}}{r!}\left\{\sum_{i<j} C_{ij}\frac{L(x_i, p)}{p}\frac{L(x_j, p)}{p} + \cdots + C_{12\cdots n}\frac{L(x_1, p)}{p}\cdots\frac{L(x_n, p)}{p}\right\}^{(r)}$,

is absolutely convergent provided that

(5.5)  $\sigma \equiv \sum |C_{ij}| + \sum |C_{ijk}| + \cdots + |C_{12\cdots n}| < 1$.


² Thanks are due to Dr. P. Kesava Menon and Prof. C. T. Rajagopal for helping to settle
certain points of detail.
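PROOF. We have, in symbolic notation,

(5.6)  $t_r = \sum A_{m_2, m_2', \cdots, m_n}\left\{C_{ij}\frac{L(x_i, p)}{p}\frac{L(x_j, p)}{p}\right\}^{m_2}\left\{C_{i'j'}\frac{L(x_{i'}, p)}{p}\frac{L(x_{j'}, p)}{p}\right\}^{m_2'} \cdots \left\{C_{12\cdots n}\frac{L(x_1, p)}{p}\cdots\frac{L(x_n, p)}{p}\right\}^{m_n}$,

where $t_r$ denotes the symbolic $r$th power in the series of the theorem, one at
least of the suffixes $i'$, $j'$ is different from $i$, $j$ (similar statements
being true of the $C$'s with 3, 4, $\cdots$ suffixes), and

$A_{m_2, m_2', \cdots, m_n} = \frac{r!}{m_2!\,m_2'! \cdots m_n!}$,  $m_2 + m_2' + \cdots + m_n = r$.

First suppose that $p \ge \tfrac{1}{2}$. Then (5.6) gives, by virtue of the lemma,

(5.7)  $|t_r| < \sum A_{m_2, m_2', \cdots, m_n} K(x_1, p)K(x_2, p) \cdots K(x_n, p)\,|C_{ij}|^{m_2}|C_{i'j'}|^{m_2'} \cdots$.

Therefore, writing

$\kappa = \max\{K(x_1, p), K(x_2, p), \cdots, K(x_n, p)\}$

we get from (5.7)

$|t_r| < \kappa^n \sum A_{m_2, m_2', \cdots, m_n}|C_{ij}|^{m_2}|C_{i'j'}|^{m_2'} \cdots |C_{12\cdots n}|^{m_n} = \kappa^n\left(\sum|C_{ij}| + \sum|C_{ijk}| + \cdots + |C_{12\cdots n}|\right)^r = \kappa^n\sigma^r$.

And so $(p^{(r)}/r!)\,|t_r| < u_r \equiv (p^{(r)}/r!)\,\kappa^n\sigma^r$, where $\Sigma u_r$ is known to be convergent
for $\sigma < 1$, and hence $\Sigma p^{(r)}t_r/r!$ is absolutely convergent for $\sigma < 1$.

In the case $p < \tfrac{1}{2}$, it is obvious from the lemma that

$\frac{p^{(r)}}{r!}|t_r| < \frac{p^{(r)}}{r!}\kappa^n r^{\frac{1}{2}(\frac{1}{2}-p)n}\sigma^r \equiv v_r$,

where

$v_r^{1/r} = \left|\frac{p^{(r)}}{r!}\right|^{1/r}\kappa^{n/r}\,r^{\frac{1}{2}(\frac{1}{2}-p)n/r}\,\sigma \to \sigma$ as $r \to \infty$.

Consequently, by Cauchy's root-test, $\Sigma v_r$ is convergent for $\sigma < 1$, and so again
$\Sigma p^{(r)}t_r/r!$ is absolutely convergent for $\sigma < 1$.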

A sufficient condition for the convergence of the series in the theorem, simpler in
form than (5.5), is

(5.8)  $N\rho^* < 1$,

where $N$ is the result of replacing every one of the $\rho$'s in the $C$'s by unity and $\rho^*$ is
the maximum of the terms in the $\rho$'s when we omit the numerical coefficients of the
terms.





A sufficient condition for the convergence of the series (4.1) is again either (5.5)
or (5.8) since, arguing exactly as above, we find that

| the $(r + 1)$th term of the series (4.1) | $< \prod_{i=1}^{n}\phi(x_i) \cdot \frac{p^{(r)}}{r!}\,\kappa^n\,r^{\sum \frac{1}{2}(\frac{1}{2}-p_i)}\,\sigma^r$,

where the summation in the power of $r$ is for all $p_i$ which are less than $\tfrac{1}{2}$.

Note. The case n = 2 makes the series in the theorem identical with a series 
obtained by W. F. Kibble [4] for a two-variate Gamma-type distribution. 
Kibble's proof of the convergence is, however, defective³ since he assumes that

L(x, p) _ Lr (x, p) 


neues eee — © 
p” perv ’ : 


is a consequence of (5.2). 


REFERENCES 

[1] A. C. AITKEN AND H. T. GONIN, "On fourfold sampling with or without replacement,"
Proc. Roy. Soc. Edinburgh. Sect. A, Vol. 55 (1935), pp. 114-125.

[2] J. T. CAMPBELL, "The Poisson correlation function," Proc. Edinburgh Math. Soc.
Series 2, Vol. 4 (1934), pp. 18-26.

[3] G. H. HARDY, "Summation of series of polynomials of Laguerre," Jour. London Math.
Soc., Vol. 7 (1932), pp. 138-140.

[4] W. F. KIBBLE, "A two-variate Gamma-type distribution," Sankhyā, Vol. 5 (1941),
pp. 137-150.

[5] W. F. KIBBLE, "An extension of a theorem of Mehler on Hermite polynomials," Proc.
Cambridge Philos. Soc., Vol. 41 (1945), pp. 12-15.

[6] G. N. WATSON, "Notes on the generating functions of polynomials: (1) Laguerre
polynomials," Jour. London Math. Soc., Vol. 8 (1933), pp. 189-192.

[7] L. FEJÉR, "Sur une méthode de M. Darboux," C. R. Acad. Sci. Paris, Vol. 147 (1908),
pp. 1040-1042.

[8] E. HILLE, "On Laguerre's series. First note," Proc. Nat. Acad. Sci., Vol. 12 (1926),
pp. 261-265.


3 Acknowledgement is due to Prof. C. T. Rajagopal for having drawn attention to this 
defect and suggested a method of removing it. 





A COMBINATORIAL CENTRAL LIMIT THEOREM¹


By WASSILY HOEFFDING
Institute of Statistics, University of North Carolina 


1. Summary. Let $(Y_{n1}, \cdots, Y_{nn})$ be a random vector which takes on the
$n!$ permutations of $(1, \cdots, n)$ with equal probabilities. Let $c_n(i, j)$, $i, j = 1, \cdots,
n$, be $n^2$ real numbers. Sufficient conditions for the asymptotic normality of

$S_n = \sum_{i=1}^{n} c_n(i, Y_{ni})$

are given (Theorem 3). For the special case $c_n(i, j) = a_n(i)b_n(j)$ a stronger version
of a theorem of Wald, Wolfowitz and Noether is obtained (Theorem 4). A
condition of Noether is simplified (Theorem 1).


2. Introduction and statement of results. An example of what is here meant by
a combinatorial central limit theorem is a solution of the following problem.
For every positive integer $n$ there are given $2n$ real numbers $a_n(i)$, $b_n(i)$, $i = 1,
\cdots, n$. It is assumed that the $a_n(i)$ are not all equal and the $b_n(i)$ are not all
equal. Let $(Y_{n1}, \cdots, Y_{nn})$ be a random vector which takes on the $n!$
permutations of $(1, \cdots, n)$ with equal probabilities $1/n!$. Under what conditions is

(1)  $S_n = \sum_{i=1}^{n} a_n(i)\,b_n(Y_{ni})$

asymptotically normally distributed as $n \to \infty$?

Throughout this paper a random variable $S_n$ will be called asymptotically
normal or asymptotically normally distributed if

$\lim_{n\to\infty} \Pr\{S_n - ES_n \le x\sqrt{\operatorname{var} S_n}\} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2}\,dy$,  $-\infty < x < \infty$,

where $ES_n$ and var $S_n$ are the mean and the variance of $S_n$.

In the particular case $a_n(i) = b_n(i) = i$ the asymptotic normality of $S_n$ was
proved by Hotelling and Pabst [2]. The first general result is due to Wald and
Wolfowitz [6], who showed that $S_n$ is asymptotically normal if, as $n \to \infty$,

(2)  $\frac{\frac{1}{n}\sum_{i=1}^{n}(a_n(i) - \bar a_n)^r}{\left[\frac{1}{n}\sum_{i=1}^{n}(a_n(i) - \bar a_n)^2\right]^{r/2}} = O(1)$,

(3)  $\frac{\frac{1}{n}\sum_{i=1}^{n}(b_n(i) - \bar b_n)^r}{\left[\frac{1}{n}\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2\right]^{r/2}} = O(1)$,  $r = 3, 4, \cdots$,
1 Work done under the sponsorship of the Office of Naval Research. 




where

$\bar a_n = \frac{1}{n}\sum_{i=1}^{n} a_n(i)$,  $\bar b_n = \frac{1}{n}\sum_{i=1}^{n} b_n(i)$.

Noether [5] proved that condition (3) can be replaced by the weaker condition

(4)  $\lim_{n\to\infty} \frac{\sum_{i=1}^{n}(b_n(i) - \bar b_n)^r}{\left[\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2\right]^{r/2}} = 0$,  $r = 3, 4, \cdots$.

This condition can be simplified as follows.

THEOREM 1. Condition (4) is equivalent to either of the following two conditions:

(5)  $\lim_{n\to\infty} \frac{\sum_{i=1}^{n}|b_n(i) - \bar b_n|^r}{\left[\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2\right]^{r/2}} = 0$ for some $r > 2$;

(6)  $\lim_{n\to\infty} \frac{\max_{1\le i\le n}(b_n(i) - \bar b_n)^2}{\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2} = 0$.

Hence conditions (2) and (5) or (2) and (6) are sufficient for the asymptotic
normality of (1).

The proof is given in Section 3. For a more general condition and a stronger but
simpler condition see Theorem 4 below.


One extension of this problem was considered by Daniels [1], who studied the
asymptotic distribution of

$\sum_{i=1}^{n}\sum_{j=1}^{n} a_n(i, j)\,b_n(Y_{ni}, Y_{nj})$.

The present paper is concerned with an alternative extension. It considers the
distribution of

(7)  $S_n = \sum_{i=1}^{n} c_n(i, Y_{ni})$,

where $c_n(i, j)$, $i, j = 1, \cdots, n$, are $n^2$ real numbers, defined for every positive
integer $n$. In the particular case $c_n(i, j) = a_n(i)b_n(j)$, (7) reduces to (1).
Let

(8)  $d_n(i, j) = c_n(i, j) - \frac{1}{n}\sum_{g=1}^{n} c_n(g, j) - \frac{1}{n}\sum_{h=1}^{n} c_n(i, h) + \frac{1}{n^2}\sum_{g=1}^{n}\sum_{h=1}^{n} c_n(g, h)$.


THEOREM 2. The mean and variance of $S_n = \sum_{i=1}^{n} c_n(i, Y_{ni})$ are

(9)  $ES_n = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} c_n(i, j)$,

(10)  $\operatorname{var} S_n = \frac{1}{n-1}\sum_{i=1}^{n}\sum_{j=1}^{n} d_n^2(i, j)$.

Henceforth we assume that $d_n(i, j) \ne 0$ for some $(i, j)$, so that var $S_n > 0$.
In the special case $c_n(i, j) = a_n(i)b_n(j)$ this corresponds to the assumption that
the $a_n(i)$ are not all equal and the $b_n(j)$ are not all equal.
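Formulas (9) and (10) can be checked for small $n$ by brute force. The following Python sketch enumerates all $n!$ permutations with exact rational arithmetic; the function names are hypothetical and the entries of `c` are assumed to be integers.

```python
from fractions import Fraction
from itertools import permutations

def exact_moments(c):
    """Mean and variance of S_n by enumerating all n! permutations."""
    n = len(c)
    values = [sum(c[i][perm[i]] for i in range(n))
              for perm in permutations(range(n))]
    count = len(values)
    mean = Fraction(sum(values), count)
    var = sum((Fraction(v) - mean) ** 2 for v in values) / count
    return mean, var

def formula_moments(c):
    """Mean and variance of S_n from (9) and (10), using d_n of (8)."""
    n = len(c)
    row = [Fraction(sum(c[i]), n) for i in range(n)]
    col = [Fraction(sum(c[g][j] for g in range(n)), n) for j in range(n)]
    grand = Fraction(sum(map(sum, c)), n * n)
    d = [[c[i][j] - row[i] - col[j] + grand for j in range(n)]
         for i in range(n)]
    mean = n * grand  # equals (1/n) * sum of all c(i, j)
    var = sum(x * x for r in d for x in r) / (n - 1)
    return mean, var
```

The two functions return identical rational values for any small integer matrix, which is what Theorem 2 asserts.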
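THEOREM 3. The distribution of $S_n = \sum_{i=1}^{n} c_n(i, Y_{ni})$ is asymptotically normal if

(11)  $\lim_{n\to\infty} \frac{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} d_n^r(i, j)}{\left[\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} d_n^2(i, j)\right]^{r/2}} = 0$,  $r = 3, 4, \cdots$.

Condition (11) is satisfied if

(12)  $\lim_{n\to\infty} \frac{\max_{1\le i,j\le n} d_n^2(i, j)}{\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} d_n^2(i, j)} = 0$.

Theorems 2 and 3 will be proved in Sections 4 and 5.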

For the special case $c_n(i, j) = a_n(i)b_n(j)$, Theorem 3 immediately gives

THEOREM 4. The distribution of $S_n = \sum_{i=1}^{n} a_n(i)b_n(Y_{ni})$ is asymptotically
normal if

(13)  $\lim_{n\to\infty} n^{\frac{r}{2}-1}\,\frac{\sum_{i=1}^{n}(a_n(i) - \bar a_n)^r\,\sum_{i=1}^{n}(b_n(i) - \bar b_n)^r}{\left[\sum_{i=1}^{n}(a_n(i) - \bar a_n)^2\right]^{r/2}\left[\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2\right]^{r/2}} = 0$,  $r = 3, 4, \cdots$.

Condition (13) is satisfied if

(14)  $\lim_{n\to\infty} n\,\frac{\max_{1\le i\le n}(a_n(i) - \bar a_n)^2\,\max_{1\le i\le n}(b_n(i) - \bar b_n)^2}{\sum_{i=1}^{n}(a_n(i) - \bar a_n)^2\,\sum_{i=1}^{n}(b_n(i) - \bar b_n)^2} = 0$.

It will be observed that the symmetrical condition (13) contains Noether's
condition (2) and (4) as a special case.

Let $X_n = (X_{n1}, \cdots, X_{nn})$ be independent of and have the same distribution
as $Y_n = (Y_{n1}, \cdots, Y_{nn})$.


THEOREM 5. The random variable

(15)  $S_n' = \sum_{i=1}^{n} c_n(X_{ni}, Y_{ni})$

has the same distribution as $S_n$ in (7).

In fact, the conditional distribution of $S_n'$, given that $X_n = p$, a fixed
permutation of $(1, \cdots, n)$, is independent of $p$ because the distribution of $Y_n$ is
invariant under permutations of its components.

The distribution of sums of the form (1) has attracted the attention of
statisticians in connection with nonparametric tests (see, for example, [2], [6], [3])
and sampling from a finite population (which leads to the case $a_n(i) = 0$ for
$i > m$; cf. also Madow [4]). More general sums of the form (7) or (15) are
likewise of interest in nonparametric theory. Thus it follows from results of Lehmann
and Stein [3] that a test of the hypothesis that $U_1, \cdots, U_n$ are independent and
identically distributed, which is most powerful similar against the alternative
that the joint frequency function is $f_1(u_1) \cdots f_n(u_n)$, is based on a statistic of the
form (7) with

$c_n(i, j) = \log f_i(u_j)$,

where the $u_j$ are the observed sample values. If the $n$ pairs $(U_1, V_1), \cdots,
(U_n, V_n)$ are independent and identically distributed, a test of the hypothesis
that $U_i$ and $V_i$ are independent which is most powerful similar against the
alternative that their joint frequency function is $f(u, v)$ is based on a statistic of
the form (15) with $c_n(i, j) = \log f(u_i, v_j)$, where $(u_1, v_1), \cdots, (u_n, v_n)$ are the
observed values.

In these examples the numbers $c_n(i, j)$ are random variables. An application
of some of the present results to such cases will be considered by the author in a
forthcoming paper.


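3. Proof of Theorem 1. Let

$g_i = \frac{b_n(i) - \bar b_n}{\left[\sum_{j=1}^{n}(b_n(j) - \bar b_n)^2\right]^{1/2}}$,  $i = 1, \cdots, n$,

$G_n = \max(|g_1|, \cdots, |g_n|)$.

Theorem 1 asserts the equivalence of the three relations

(16)  $\lim_{n\to\infty} \sum_{i=1}^{n} g_i^r = 0$,  $r = 3, 4, \cdots$;

(17)  $\lim_{n\to\infty} \sum_{i=1}^{n} |g_i|^r = 0$ for some $r > 2$;

(18)  $\lim_{n\to\infty} G_n = 0$.

We have

$\sum_{i=1}^{n} g_i^2 = 1$,

and hence for $r > 2$

$G_n^r \le \sum_{i=1}^{n} |g_i|^r \le G_n^{r-2}\sum_{i=1}^{n} g_i^2 = G_n^{r-2}$.

The equivalence of (16), (17) and (18) follows immediately.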







4. Proof of Theorem 2. The subscript n in Y,;, en(z, 7), ete., will henceforth 
be omitted. We note that if the subscripts 7, +--+ , 7m are distinct, the expected 
value of a function f(Y;,,--- , Yi,,) is equal to 


n(n —1)---mMm—m+l)4j 2 Lary +++ dm), 


where the sum 2’ is extended over all m-tuples os ,*** 5 jm) of distinct integers 
from 1 to n. Relation (9) follows immediately. 
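Let

$$T_n = \sum_{i=1}^n d(i, Y_i), \tag{19}$$

where $d(i, j) = d_n(i, j)$ is defined by (8). Using (9), we get

$$T_n = S_n - ES_n. \tag{20}$$

Also

$$\sum_{i=1}^n d(i, j) = 0 \ \text{for all } j, \qquad \sum_{j=1}^n d(i, j) = 0 \ \text{for all } i. \tag{21}$$

Hence

$$Ed(i, Y_i) = 0, \qquad Ed^2(i, Y_i) = \frac{1}{n} \sum_{j=1}^n d^2(i, j),$$

and if $i \neq j$,

$$Ed(i, Y_i)\,d(j, Y_j) = \frac{1}{n(n-1)} \sum_{g \neq h} d(i, g)\,d(j, h) = -\frac{1}{n(n-1)} \sum_{g=1}^n d(i, g)\,d(j, g).$$

Therefore

$$\begin{aligned}
\operatorname{var} S_n = \operatorname{var} T_n &= \sum_{i=1}^n Ed^2(i, Y_i) + \sum_{i \neq j} Ed(i, Y_i)\,d(j, Y_j) \\
&= \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n d^2(i, j) - \frac{1}{n(n-1)} \sum_{g=1}^n \sum_{i \neq j} d(i, g)\,d(j, g) \\
&= \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n d^2(i, j) + \frac{1}{n(n-1)} \sum_{g=1}^n \sum_{i=1}^n d^2(i, g),
\end{aligned}$$

which gives relation (10).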
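
The variance computation can be confirmed by brute force for small n. The sketch below assumes that $d(i, j)$ in (8) is the doubly centered version of $c(i, j)$ (row mean and column mean removed, grand mean added back), which is consistent with (21) but is not shown in this part of the text; the matrix `c` and all function names are illustrative.

```python
from itertools import permutations


def double_center(c):
    """Assumed form of (8): remove row and column means, add back the grand mean."""
    n = len(c)
    row = [sum(c[i]) / n for i in range(n)]
    col = [sum(c[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[c[i][j] - row[i] - col[j] + grand for j in range(n)] for i in range(n)]


def permutation_variance(c):
    """Exact variance of S = sum_i c(i, Y_i) over all n! equally likely permutations."""
    n = len(c)
    values = [sum(c[i][p[i]] for i in range(n)) for p in permutations(range(n))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def formula_variance(c):
    """Sum of the two terms in the last display: (1/(n-1)) * sum of d(i, j)^2."""
    n = len(c)
    d = double_center(c)
    return sum(x * x for row in d for x in row) / (n - 1)


# Hypothetical 4 x 4 array, chosen only for illustration.
c = [[1.0, 4.0, 2.0, 0.0],
     [3.0, 1.0, 5.0, 2.0],
     [0.0, 2.0, 2.0, 7.0],
     [6.0, 1.0, 3.0, 3.0]]
print(permutation_variance(c), formula_variance(c))
```

The two printed numbers agree up to rounding, since the two terms of the final expression combine to $(n-1)^{-1}\sum_i\sum_j d^2(i, j)$.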

5. Proof of Theorem 3. Let

$$M_{r,n} = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n d^r(i, j), \tag{22}$$





COMBINATORIAL CENTRAL LIMIT THEOREM 


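$$\bar M_{r,n} = \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^n |d(i, j)|^r, \tag{23}$$

$$D_n = \max_{1 \le i, j \le n} |d(i, j)|. \tag{24}$$

Then var $S_n = [n/(n-1)]\, M_{2,n}$. Since, by hypothesis, var $S_n > 0$, we may and
shall assume that

$$M_{2,n} = 1. \tag{25}$$

Conditions (11) and (12) can now be written as

$$\lim_{n\to\infty} M_{r,n} = 0, \qquad r = 3, 4, \ldots, \tag{26}$$

and

$$\lim_{n\to\infty} D_n = 0. \tag{27}$$

That (27) implies (26) is seen from the inequalities

$$|M_{r,n}| \le \bar M_{r,n} \le D_n^{r-2} M_{2,n} = D_n^{r-2} \quad \text{for } r > 2.$$

Since

$$\bar M_{2r+1,n}^2 \le M_{2r,n}\, M_{2r+2,n}, \qquad r = 1, 2, \ldots,$$

condition (26) implies

$$\lim_{n\to\infty} \bar M_{r,n} = 0, \qquad r = 3, 4, \ldots. \tag{28}$$

As var $S_n \to 1$, it is now sufficient to demonstrate that under conditions (25)
and (28), $T_n = S_n - ES_n$ has a normal limiting distribution with mean 0 and
variance 1. This will be proved by showing that

$$\lim_{n\to\infty} ET_n^r = \begin{cases} 1 \cdot 3 \cdots (r-1) & \text{if } r \text{ is even}, \\ 0 & \text{if } r \text{ is odd}. \end{cases} \tag{29}$$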
n-?e 


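The rth moment of $T_n$,

$$ET_n^r = E \sum_{i_1=1}^n \cdots \sum_{i_r=1}^n d(i_1, Y_{i_1}) \cdots d(i_r, Y_{i_r}), \tag{30}$$

can be written as a sum of terms of the form

$$I(r, e_1, \ldots, e_m) = \sum{}'_{i_1, \ldots, i_m} E\, d^{e_1}(i_1, Y_{i_1}) \cdots d^{e_m}(i_m, Y_{i_m}), \tag{31}$$

where $e_i \ge 1$, $e_1 + \cdots + e_m = r$. The number of terms (31) is independent of n.
It will be shown that

$$\lim_{n\to\infty} I(r, e_1, \ldots, e_m) = 0 \quad \text{unless } r = 2m, \tag{32}$$

$$\lim_{n\to\infty} I(r, 2, \ldots, 2) = 1 \tag{33}$$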






and that the number of terms $I(r, 2, \ldots, 2)$ in (30) with r even equals
$1 \cdot 3 \cdots (r-1)$. Then (29) holds, and the theorem will be proved.
We have for $n \to \infty$

$$I(r, e_1, \ldots, e_m) \sim n^{-m} \sum{}'_{i_1, \ldots, i_m} \sum{}'_{j_1, \ldots, j_m} d^{e_1}(i_1, j_1) \cdots d^{e_m}(i_m, j_m). \tag{34}$$

The right-hand side can be written as a sum of terms which, apart from the sign,
are of the form

$$n^{-m} J(r, p, q, e_1, \ldots, e_m) = n^{-m} \sum_{i_1=1}^n \cdots \sum_{i_p=1}^n \sum_{j_1=1}^n \cdots \sum_{j_q=1}^n d^{e_1}(i_{c_1}, j_{d_1}) \cdots d^{e_m}(i_{c_m}, j_{d_m}), \tag{35}$$

where

$$1 \le p \le m, \quad 1 \le q \le m, \quad 1 \le c_g \le p, \quad 1 \le d_h \le q \quad (g, h = 1, \ldots, m),$$

and for every integer u, $1 \le u \le p$ ($1 \le u \le q$), at least one $c_g$ ($d_h$) is equal to u.
The number of terms (35) is independent of n.
The sum J in (35) can be written as a product of $s \ge 1$ sums of a similar form,

$$J(r, p, q, e_1, \ldots, e_m) = \prod_{k=1}^s J(r_k, p_k, q_k, e_{k1}, \ldots, e_{km_k}), \tag{36}$$

where $(e_{k1}, \ldots, e_{km_k})$, $k = 1, \ldots, s$, are s disjoint subsets of $(e_1, \ldots, e_m)$,

$$\begin{aligned}
e_{k1} + \cdots + e_{km_k} &= r_k, & r_1 + \cdots + r_s &= r, \\
p_1 + \cdots + p_s &= p, & q_1 + \cdots + q_s &= q, \\
m_1 + \cdots + m_s &= m.
\end{aligned} \tag{37}$$

We observe that

$$1 \le p_k \le m_k, \qquad 1 \le q_k \le m_k, \qquad m_k \le r_k. \tag{38}$$

It will be assumed that s is the greatest possible number of factors into which
$J(r, p, q, e_1, \ldots, e_m)$ can be decomposed in the form (36). If s = 1, the number
of equalities between the subscripts c or between the subscripts d in (35) must
be at least m - 1. The total number of subscripts c, d being 2m, there are at
most m + 1 distinct subscripts, so that $p + q \le m + 1$. If

$$(c_g, d_g) = (c_h, d_h) \quad \text{for some } (g, h),\ g \neq h, \tag{39}$$

we have strict inequality. For an arbitrary s we have in a similar way

$$p_k + q_k \le m_k + 1, \qquad k = 1, \ldots, s, \tag{40}$$







and hence

$$p + q \le m + s, \tag{41}$$

with strict inequality in the case (39).
By Hölder's inequality, from (35),

$$|J(r, p, q, e_1, \ldots, e_m)| \le \prod_{g=1}^m \Big( \sum_{i_1=1}^n \cdots \sum_{i_p=1}^n \sum_{j_1=1}^n \cdots \sum_{j_q=1}^n |d(i_{c_g}, j_{d_g})|^r \Big)^{e_g/r} = \prod_{g=1}^m \left( n^{p+q-1} \bar M_{r,n} \right)^{e_g/r} = n^{p+q-1} \bar M_{r,n}.$$

Similarly,

$$|J(r_k, p_k, q_k, e_{k1}, \ldots, e_{km_k})| \le n^{p_k+q_k-1} \bar M_{r_k,n}.$$

Hence, by (36),

$$n^{-m} |J(r, p, q, e_1, \ldots, e_m)| \le n^{p+q-s-m} \bar M_{r_1,n} \cdots \bar M_{r_s,n}. \tag{42}$$

If, for some k, $r_k = 1$, then, by (38) and (37), $p_k = q_k = m_k = e_{k1} = 1$, and hence
J = 0 by (21). Thus we may assume $r_k \ge 2$, $k = 1, \ldots, s$. Then, by (28),
$\bar M_{r_1,n} \cdots \bar M_{r_s,n} \to 0$ unless $r_1 = \cdots = r_s = 2$. It now follows from (42) and
(41) that

$$\lim_{n\to\infty} n^{-m} J(r, p, q, e_1, \ldots, e_m) = 0 \tag{43}$$

except perhaps when $r_1 = \cdots = r_s = 2$.
If $r_1 = \cdots = r_s = 2$, we have

$$n^{-m} J(r, p, q, e_1, \ldots, e_m) = O(n^{p+q-s-m}). \tag{44}$$

By (38), $r_k = 2$ implies $m_k = 1$ or 2. If $m_k = 2$, then $e_{k1} = e_{k2} = 1$ and
$p_k + q_k \le 3$ by (40). If $p_k + q_k = 3$, the corresponding J-factor is of the form

$$\sum_i \sum_j \sum_k d(i, j)\,d(i, k) \quad \text{or} \quad \sum_i \sum_j \sum_k d(i, k)\,d(j, k),$$

both of which vanish by (21). If $m_k = 2$ and $p_k + q_k = 2$, we have case (39)
and hence, by the remark following (41), $p + q - s - m < 0$. By (44), this
implies (43).
Thus the only case where (43) need not hold is $r_k = 2$, $m_k = 1$ for $k = 1, \ldots, s$.
Then $p_k = q_k = 1$, $e_{k1} = 2$, hence

$$r = 2s = 2m, \qquad p = q = r/2, \qquad e_1 = \cdots = e_m = 2.$$

This proves relation (32), and (33) follows from

$$I(r, 2, \ldots, 2) \sim n^{-r/2} \Big( \sum_{i=1}^n \sum_{j=1}^n d^2(i, j) \Big)^{r/2} = M_{2,n}^{r/2} = 1.$$






It remains to determine the number of terms $I(r, 2, \ldots, 2)$ in (30) when r is
even. This is the number of ways the subscripts $i_1, \ldots, i_r$ can be tied in r/2
groups of two, which is $(r-1)(r-3) \cdots 3 \cdot 1$. The proof is complete.
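
The pairing count can be verified by direct enumeration. This is an illustrative sketch only; the helper names are assumptions and do not appear in the paper.

```python
def pairings(items):
    """Yield every way of splitting the list `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest)):
        pair = (first, rest[k])
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [pair] + tail


def odd_product(r):
    """Return (r - 1)(r - 3) ... 3 * 1 for even r."""
    out = 1
    for k in range(1, r, 2):
        out *= k
    return out


for r in (2, 4, 6, 8):
    print(r, sum(1 for _ in pairings(list(range(r)))), odd_product(r))
```

The enumerated count matches the product for each even r tried, and the product is also the rth moment of a standard normal variable, which is what (29) requires.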


REFERENCES 

[1] H. E. DANIELS, "The relation between measures of correlation in the universe of sample
permutations," Biometrika, Vol. 33 (1944), pp. 129-135.

[2] H. HOTELLING AND M. PABST, "Rank correlation and tests of significance involving no
assumption of normality," Annals of Math. Stat., Vol. 7 (1936), pp. 29-43.

[3] E. L. LEHMANN AND C. STEIN, "On the theory of some nonparametric hypotheses,"
Annals of Math. Stat., Vol. 20 (1949), pp. 28-45.

[4] W. G. MADOW, "On the limiting distributions of estimates based on samples from finite
universes," Annals of Math. Stat., Vol. 19 (1948), pp. 535-545.

[5] G. E. NOETHER, "On a theorem by Wald and Wolfowitz," Annals of Math. Stat., Vol.
20 (1949), pp. 455-458.

[6] A. WALD AND J. WOLFOWITZ, "Statistical tests based on permutations of the observations,"
Annals of Math. Stat., Vol. 15 (1944), pp. 358-372.





ON RATIOS OF CERTAIN ALGEBRAIC FORMS 
By ROBERT V. HOGG


State University of Iowa 


1. Introduction. In an investigation of the ratio of the mean square successive 
difference to the mean square difference in random samples from a normal uni- 
verse with mean zero, J. D. Williams [4] proved the rather surprising fact that 
any moment of this ratio is equal to the corresponding moment of the numerator 
divided by that of the denominator. Later Tjalling Koopmans [2] and John
von Neumann [3] showed independently that this ratio and its denominator are 
stochastically independent. From this, Williams’ theorem is an immediate con- 
sequence. In this paper, we determine a necessary and sufficient condition for 
the stochastic independence of a ratio and its denominator. We then use this 
condition in our study of certain ratios of algebraic forms. 


2. Stochastic independence of a ratio and its denominator. We prove the
following theorem for the continuous type distribution. Consider two one-dimensional
random variables x and y and their probability density function
g(x, y). Let $P(y \le 0) = 0$. Assume the moment generating function, $M(u, t) =
E[\exp(ux + ty)]$, exists for $-T < u, t < T$, $T > 0$. The theorem is as follows.

THEOREM 1. Under the conditions stated, in order that y and r = x/y be stochastically
independent, it is necessary and sufficient that

$$\frac{\partial^k M(0, t)}{\partial u^k} = \frac{\partial^k M(0, 0) / \partial u^k}{\partial^k M(0, 0) / \partial t^k} \cdot \frac{\partial^k M(0, t)}{\partial t^k}$$

for $k = 0, 1, 2, \ldots$.

PROOF OF NECESSITY. If f(r, y) is the probability density function of the
variables r and y, it is well known that a necessary and sufficient condition for
the independence of the random variables r and y is that $f(r, y) = f_1(r) f_2(y)$,
where $f_1(r)$ and $f_2(y)$ are the marginal density functions of r and y respectively.
Hence, since x = ry,

$$M(u, t) = E[\exp(ury + ty)];$$

$$M(u, t) = \iint \exp(ury + ty)\, f_1(r) f_2(y)\, dr\, dy.$$

By hypothesis, the moments of x of order k exist; so

$$\frac{\partial^k M(0, t)}{\partial u^k} = \iint (ry)^k \exp(ty)\, f_1(r) f_2(y)\, dr\, dy.$$

Finally,

$$\frac{\partial^k M(0, t)}{\partial u^k} = \int r^k f_1(r)\, dr \cdot \int y^k \exp(ty)\, f_2(y)\, dy,$$

for $k = 0, 1, 2, \ldots$. If we set t = 0, we see that $\int r^k f_1(r)\, dr$ exists, since it
is equal to the quotient of the kth moments of x and y,

$$K_k = \frac{\partial^k M(0, 0) / \partial u^k}{\partial^k M(0, 0) / \partial t^k}.$$

The hypothesis precludes the moments of y being zero. We also note that

$$\int y^k \exp(ty)\, f_2(y)\, dy = \frac{\partial^k M(0, t)}{\partial t^k};$$

consequently

$$\frac{\partial^k M(0, t)}{\partial u^k} = K_k \frac{\partial^k M(0, t)}{\partial t^k},$$

for $k = 0, 1, 2, \ldots$.
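PROOF OF SUFFICIENCY. Consider the identity

$$\frac{\partial^k M(0, t)}{\partial u^k} = K_k \frac{\partial^k M(0, t)}{\partial t^k},$$

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k \exp(ty)\, g(x, y)\, dx\, dy = K_k \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^k \exp(ty)\, g(x, y)\, dx\, dy. \tag{2.1}$$

Since all the moments of x and y exist, we may differentiate p times with respect
to t under the integral signs. Then if we set t = 0,

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^p g(x, y)\, dx\, dy = K_k \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^{k+p} g(x, y)\, dx\, dy,$$

for $p = 0, 1, 2, \ldots$. Although t has been restricted to the range $-T < t < T$,
we may extend that range to $-\infty < t < T$ and still have the existence of M(u, t).
The condition that $P(y \le 0) = 0$ further permits us to integrate (2.1) $p' \le k$
times under the integral signs as shown below.

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{0} \int_{-\infty}^{t_1} \cdots \int_{-\infty}^{t_{p'-1}} x^k \exp(t_{p'} y)\, g(x, y) \prod_{j=1}^{p'} dt_j\, dx\, dy
= K_k \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{0} \int_{-\infty}^{t_1} \cdots \int_{-\infty}^{t_{p'-1}} y^k \exp(t_{p'} y)\, g(x, y) \prod_{j=1}^{p'} dt_j\, dx\, dy,$$

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^{-p'} g(x, y)\, dx\, dy = K_k \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^{k-p'} g(x, y)\, dx\, dy,$$

for $p' = 1, 2, 3, \ldots, k$. These two expressions may be written

$$E(x^k y^m) = K_k E(y^{k+m})$$

for $k = 0, 1, 2, \ldots$ and $m = -k, \ldots, -1, 0, 1, 2, \ldots$. If m = -k, then
$K_k = E[(x/y)^k]$; so

$$E(x^k y^m) = E\left[\left(\frac{x}{y}\right)^k\right] E(y^{k+m}), \qquad
E\left[\left(\frac{x}{y}\right)^k y^{k+m}\right] = E\left[\left(\frac{x}{y}\right)^k\right] E(y^{k+m}),$$

for $k = 0, 1, 2, \ldots$ and $m = -k, \ldots, -1, 0, 1, 2, \ldots$. This could also be
rewritten as

$$E(r^k y^h) = E(r^k) \cdot E(y^h),$$

for $k = 0, 1, 2, \ldots$ and $h = 0, 1, 2, \ldots$. This is sufficient to insure stochastic
independence of r and y; thus the proof is complete.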


3. Ratios of linear forms in gamma variables. Let the independent random
variables $x_i$ have the gamma density functions

$$f_i(x_i) = \begin{cases} \dfrac{1}{\Gamma(c_i + 1)\, d_i^{\,c_i+1}}\, (x_i)^{c_i} \exp\left(-\dfrac{x_i}{d_i}\right), & 0 \le x_i < \infty, \\[1ex] 0, & \text{elsewhere}, \end{cases}$$

where $c_i > -1$ and $d_i > 0$, for $i = 1, 2, \ldots, n$. Construct the two real linear
forms $L_1 = \sum_1^n a_i x_i$ and $L_2 = \sum_1^n b_i x_i$, $b_i > 0$. Let $L_1$ and $L_2$ be linearly
independent; thus their ratio will not be a mere constant.
THEOREM 2. Under the conditions stated, a necessary and sufficient condition
that $L_2$ and $L_1/L_2$ be stochastically independent is that

$$b_1 d_1 = b_2 d_2 = \cdots = b_n d_n.$$


PROOF. Our proof consists in showing, by the use of Theorem 1, that if some
of the $b_i d_i$ values are distinct, the variance of $L_1/L_2$ is equal to zero. This fact
further implies that the ratio is a constant, and hence the necessity of the
condition is proved by contradiction. For the sufficiency, we demonstrate that
the partial derivatives of the moment generating function $E[\exp(uL_1 + tL_2)]$
satisfy the condition of Theorem 1. However, in the interest of conservation of paper,
a referee¹ has suggested that upon setting $u_j = x_j/d_j$, von Neumann's argument
[3] may be made to complete the proof.


1 We take this opportunity to thank the Referee for this and other suggestions. 
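
Theorem 2 can be illustrated by simulation. The sketch below is only a plausibility check under stated assumptions: the coefficients, shapes, scales, seed, and sample size are hypothetical choices, and a near-zero sample correlation is a necessary symptom of independence, not a proof of it. The shape parameter passed to the generator corresponds to $c_i + 1$ and the scale to $d_i$.

```python
import random


def correlation(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ratio_denominator_correlation(a, b, shapes, scales, draws=20000, seed=0):
    """Correlation between L1/L2 and L2 for independent gamma variables x_i."""
    rng = random.Random(seed)
    ratios, denominators = [], []
    for _ in range(draws):
        x = [rng.gammavariate(k, d) for k, d in zip(shapes, scales)]
        l1 = sum(ai * xi for ai, xi in zip(a, x))
        l2 = sum(bi * xi for bi, xi in zip(b, x))
        ratios.append(l1 / l2)
        denominators.append(l2)
    return correlation(ratios, denominators)


# b_1 d_1 = b_2 d_2: the condition of Theorem 2 holds.
corr_equal = ratio_denominator_correlation([1.0, 0.0], [1.0, 1.0], [2.0, 2.0], [1.0, 1.0])
# b_1 d_1 != b_2 d_2: the condition fails.
corr_unequal = ratio_denominator_correlation([1.0, 0.0], [1.0, 1.0], [2.0, 2.0], [1.0, 5.0])
print(corr_equal, corr_unequal)
```

With equal products $b_i d_i$ the correlation is close to zero, while with unequal products it is clearly negative, since a large denominator then tends to come from the variable with the larger scale.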

































An interesting consequence of Theorem 2 is the following corollary. Let
$Q_1 = X'AX$ and $Q_2 = X'BX$ be two real symmetric quadratic forms in n random
values of a variable normally distributed with mean zero. We restrict $Q_2$
to be nonnegative (or nonpositive). Let AB = BA. It is known ([1], p. 25) that
there then exists an orthogonal matrix C such that simultaneously C'AC and
C'BC are diagonal matrices formed by the characteristic numbers $a_i$ of A and
$b_i$ of B respectively. Let the rank of AB equal the rank of A. Thus if $b_i = 0$,
the corresponding $a_i = 0$. Further let $Q_1$ and $Q_2$ be linearly independent.

COROLLARY. If the above conditions are satisfied, a necessary and sufficient condition
that $Q_2$ and $Q_1/Q_2$ be stochastically independent is that $B^2 = bB$, where b is
a real nonzero constant.

This corollary is essentially the theorem suggested by von Neumann's original
argument.






4. Ratios of linear forms.
THEOREM 3. Let x have a continuous distribution such that $m(t) = E[\exp(tx)]$
exists for $-T < t < T$, $T > 0$. Let the real linear forms $L_1 = \sum_1^n a_i x_i$ and
$L_2 = \sum_1^n x_i$, in n random values of x, be linearly independent. Provided
$P(x \le 0) = 0$ [$P(x \ge 0) = 0$], a necessary and sufficient condition for $L_2$ and
$L_1/L_2$ to be stochastically independent is that x [-x] have a gamma distribution.

PROOF OF SUFFICIENCY. We use Theorem 2. If x has a gamma distribution
and the set $x_1, x_2, \ldots, x_n$ is a random sample, then $d_1 = d_2 = \cdots = d_n$. We
also note that $b_1 = b_2 = \cdots = b_n = 1$. Hence $b_1 d_1 = b_2 d_2 = \cdots = b_n d_n$. This
implies that $L_2$ and $L_1/L_2$ are stochastically independent.

PROOF OF NECESSITY. Write

$$M(u, t) = E[\exp(uL_1 + tL_2)] = \prod_{i=1}^n m(a_i u + t).$$

Since the conditions of Theorem 1 are satisfied, the stochastic independence of
$L_2$ and $L_1/L_2$ implies

$$\frac{\partial^k M(0, t)}{\partial u^k} = K_k \frac{\partial^k M(0, t)}{\partial t^k}, \qquad k = 0, 1, 2, \ldots. \tag{4.1}$$

Using this condition for k = 1 we find

$$\sum_1^n a_i = nK_1. \tag{4.2}$$

For k = 2, (4.1) becomes

$$\Big(\sum_1^n a_i^2\Big) [m''(t)][m(t)]^{n-1} + \Big(2 \sum_{i<j} a_i a_j\Big) [m'(t)]^2 [m(t)]^{n-2}
= K_2 \left\{ n[m''(t)][m(t)]^{n-1} + n(n-1)[m'(t)]^2 [m(t)]^{n-2} \right\}. \tag{4.3}$$







We now show that this identity implies that

$$[m''(t)][m(t)]^{n-1} = c\,[m'(t)]^2 [m(t)]^{n-2}, \tag{4.4}$$

where

$$c = \frac{[m''(0)][m(0)]^{n-1}}{[m'(0)]^2 [m(0)]^{n-2}}.$$

To do this we assume (4.4) is not true. That is, we assume $m''(t)[m(t)]^{n-1}$ and
$[m'(t)]^2 [m(t)]^{n-2}$ to be linearly independent. By considering the coefficients of
the linearly independent functions in (4.3), we find

$$\sum_1^n a_i^2 = nK_2$$

and

$$2 \sum_{i<j} a_i a_j = n(n-1)K_2.$$

Adding these two equations we have

$$\Big(\sum_1^n a_i\Big)^2 = n^2 K_2.$$

This result with (4.2) implies that $K_1^2 = K_2$. However $K_1 = E[L_1/L_2]$ and
$K_2 = E[(L_1/L_2)^2]$; so the variance of the ratio must equal zero. This requires the
ratio to equal a constant; that is, $K_1 = L_1/L_2$. However this is contrary to
the hypothesis that $L_1$ and $L_2$ be linearly independent. Thus (4.4) must be an
identity.

We have now found that the stochastic independence of $L_2$ and $L_1/L_2$ imposes
the restriction

$$m''(t)\, m(t) = c\,[m'(t)]^2$$

on the moment generating function of the distribution from which the samples
are drawn. Since m(t) is a moment generating function, m(0) = 1, m'(0) = E(x),
and $m''(0) = E(x^2)$. Moreover, with a continuous distribution, $E(x^2) > [E(x)]^2$
and hence c > 1. Accordingly, we can say that (4.1) for k = 1, 2 requires m(t)
to be the unique solution to the above differential equation with the given
boundary condition m(0) = 1. That is,

$$m(t) = (1 - bt)^{-1/(c-1)}, \qquad c > 1,$$

where b is an arbitrary constant. Hence (4.1) for k = 1, 2 restricts us to moment
generating functions of the gamma type. It might be urged that (4.1) for k =
3, 4, 5, ... could further restrict our solution. But this cannot be the case since
we proved the sufficiency of the gamma distribution for the stochastic independence
of $L_2$ and $L_1/L_2$. That is, M(u, t) must satisfy (4.1) if $m(t) =
E[\exp(tx)]$, where x has a gamma distribution. This completes the proof of the
necessity of the condition.
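
The solution of the differential equation can be checked numerically. The sketch below uses central finite differences rather than symbolic differentiation; the values of b, c, and t are arbitrary illustrative assumptions, and the step size h is a tuning choice.

```python
def mgf(t, b, c):
    """Gamma-type moment generating function m(t) = (1 - b t)^(-1/(c - 1))."""
    return (1.0 - b * t) ** (-1.0 / (c - 1.0))


def ode_residual(t, b, c, h=1e-4):
    """Finite-difference estimate of m''(t) m(t) - c [m'(t)]^2."""
    m0 = mgf(t, b, c)
    m_plus, m_minus = mgf(t + h, b, c), mgf(t - h, b, c)
    first = (m_plus - m_minus) / (2.0 * h)
    second = (m_plus - 2.0 * m0 + m_minus) / (h * h)
    return second * m0 - c * first * first


# Hypothetical parameter values; any b and c > 1 with b * t < 1 would do.
print(mgf(0.0, 2.0, 1.5), ode_residual(0.1, 2.0, 1.5))
```

The residual is zero up to discretization error and m(0) = 1, in agreement with the stated solution. Writing the exponent as $-a$ with $a = 1/(c-1)$ gives $c = (a+1)/a$, which is the relation that makes the two sides match.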







The author wishes to express his appreciation to Professor A. T. Craig for 
the suggestions made during the preparation of this paper. 


REFERENCES 

[1] H. WEYL, The Theory of Groups and Quantum Mechanics, Methuen and Co., Ltd.,
London, 1931.

[2] TJALLING KOOPMANS, "Serial correlation and quadratic forms in normal variables,"
Annals of Math. Stat., Vol. 13 (1942), pp. 14-33.

[3] J. VON NEUMANN, "Distribution of the ratio of the mean square successive difference to
the variance," Annals of Math. Stat., Vol. 12 (1941), pp. 367-395.

[4] J. D. WILLIAMS, "Moments of the ratio of the mean square successive difference to the
mean square difference in samples from a normal universe," Annals of Math.
Stat., Vol. 12 (1941), pp. 239-241.

[5] ROBERT V. HOGG, "On ratios of certain algebraic forms in statistics," unpublished
thesis, State University of Iowa.





NORMAL REGRESSION THEORY IN THE PRESENCE OF 
INTRA-CLASS CORRELATION 


By MAX HALPERIN¹
USAF School of Aviation Medicine²


1. Summary. In this paper we prove that certain estimators and tests of sig- 
nificance used in regression analysis when observations are independent are 
equally valid in the presence of intra-class correlation. An application of this 
result is presented for the situation in which several replications of the correlated 
set of observations are available. As a special case of this application, it is shown 
that the usual test of “column effects” in the analysis of variance for a two-way 
classification remains valid when rows are independent and columns are uni- 
formly correlated. This latter fact is also pointed out in [3]. 


2. Introduction. In the usual treatment of regression theory, as in [1] (Chapters
VIII and IX), it is assumed that we have a sample of n independent observations,
$y_1, \ldots, y_n$, where $y_\alpha$ arises from a normal distribution with mean $\sum_{p=1}^k C_p x_{p\alpha}$
and variance $\sigma^2$. Here, the $x_{p\alpha}$ are taken to be fixed variates. On the basis of
these assumptions, unbiased estimates of $C_1, C_2, \ldots, C_k$ are obtained, and
two theorems are proved, one concerning the joint distribution of the estimates
of the $C_p$ and the sum of squares of deviations from regression, the other concerning
tests of significance of the $C_p$.

Now, on the one hand, it may happen that the results given in [1] are applied
when, unknown to the experimenter, the observations are actually correlated.
On the other hand, it may be clear, a priori, that the observations are correlated
and that estimates and tests of the $C_p$ are required in the light of the particular
kind of correlation assumed to hold. In either case an investigation of estimates
and distributions is called for. We consider these questions in Section 3 for the
case that $y_1, \ldots, y_n$ have a variance matrix

$$R_n = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}.$$

In Section 4 we consider an application of our result to several replications of
the correlated set of observations.

3. Estimates and significance tests in normal regression theory for correlated 
observations. We slightly modify the regression model indicated in Section 2 


1 Now at National Heart Institute, Bethesda, Md. 
2 This paper represents the views of the author and not necessarily those of the Department
of the Air Force.


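by supposing that the expected values of the $y_\alpha$ are given by

$$Ey_\alpha = \mu + \sum_{p=1}^k C_p x_{p\alpha}, \qquad \alpha = 1, 2, \ldots, n. \tag{3.1}$$

The reason for this modification will be apparent later. Assuming then that
the $y_\alpha$ have the covariance matrix $R_n$, the appropriate sample likelihood of
$y_1, \ldots, y_n$ is readily seen to be

$$P(y_1, \ldots, y_n) = \frac{|R_n^{-1}|^{1/2}}{(2\pi)^{n/2}} \exp\left\{ -\tfrac{1}{2}\, (y - Ey)\, R_n^{-1}\, (y - Ey)' \right\}, \tag{3.2}$$

where

$$y = (y_1, \ldots, y_n), \qquad Ey = \mu(1, \ldots, 1) + \sum_{p=1}^k C_p (x_{p1}, \ldots, x_{pn}). \tag{3.21}$$

The maximum likelihood equations for the estimation of parameters from
(3.2) are of such a formidable character that an explicit solution does not appear
possible. As alternative estimates for $\mu, C_1, \ldots, C_k$, one can use

$$\hat\mu = \bar y - \sum_{p=1}^k \hat C_p \bar x_p, \qquad \hat C_p = \sum_{r=1}^k s_{ry}\, s^{rp}, \tag{3.3}$$

where

$$s_{ry} = \sum_{\alpha=1}^n (x_{r\alpha} - \bar x_r)(y_\alpha - \bar y), \tag{3.31}$$

and the $s^{rp}$ are elements of the inverse of

$$S = \begin{pmatrix} s_{11} & \cdots & s_{1k} \\ \vdots & & \vdots \\ s_{k1} & \cdots & s_{kk} \end{pmatrix}, \tag{3.32}$$

where

$$s_{ij} = \sum_{\alpha=1}^n (x_{i\alpha} - \bar x_i)(x_{j\alpha} - \bar x_j). \tag{3.33}$$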


We go on now to investigate the distribution of $\hat C_1, \ldots, \hat C_k$ when (3.2)
holds. We have the following

THEOREM A. Let $y_1, \ldots, y_n$ be a sample of one from a multivariate normal
population with covariance matrix $R_n$ and means $\mu + \sum_{p=1}^k C_p x_{p\alpha}$, $\alpha = 1, \ldots, n$.
Let estimates of $\mu$ and the $C_p$ be $\hat\mu$ and the $\hat C_p$ as defined in (3.3). Then

(a) the $(\hat C_p - C_p)$ have a multivariate normal distribution with zero means and
covariance matrix $(1 - \rho)\sigma^2 S^{-1}$, and

(b) the quantity $\sum_{\alpha=1}^n (y_\alpha - \hat\mu - \sum_p \hat C_p x_{p\alpha})^2$ (= V) is distributed as $(1 - \rho)\sigma^2\chi^2$
with (n - k - 1) degrees of freedom, and independently of the $\hat C_p$.

PROOF. Conclusion (a) of the theorem follows readily from the fact that the
$\hat C_p$ are linear functions of variables obeying a multivariate normal law and from
some simple calculations to verify that the $\hat C_p$ are unbiased and have the indicated
covariance matrix. The details are omitted.

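Now let

$$L = \begin{pmatrix} l_{11} & \cdots & l_{1,n-1} & n^{-1/2} \\ \vdots & & \vdots & \vdots \\ l_{n1} & \cdots & l_{n,n-1} & n^{-1/2} \end{pmatrix}$$

be an n × n orthogonal matrix, and let

$$z = yL, \qquad w_i = x_i L. \tag{3.5}$$

By this transformation (3.2) becomes

$$\frac{1}{[2\pi\sigma^2(1-\rho)]^{(n-1)/2}} \exp\left\{ -\frac{\sum_{\alpha=1}^{n-1} (z_\alpha - Ez_\alpha)^2}{2\sigma^2(1-\rho)} \right\}
\cdot \frac{1}{\sigma\sqrt{2\pi\{1 + (n-1)\rho\}}} \exp\left\{ -\frac{(z_n - Ez_n)^2}{2\sigma^2\{1 + (n-1)\rho\}} \right\},$$

while

$$s_{ij} = \sum_{\alpha=1}^n w_{i\alpha} w_{j\alpha} - w_{in} w_{jn} = \sum_{\alpha=1}^{n-1} w_{i\alpha} w_{j\alpha}, \qquad
s_{ry} = \sum_{\alpha=1}^{n-1} w_{r\alpha} z_\alpha.$$

Applying the transformation (3.5) to the $\hat C_p$ and V, it is easy to show that

$$\hat C_p = \sum_{r=1}^k s_{ry}\, s^{rp}, \qquad
V = \sum_{\alpha=1}^{n-1} \Big( z_\alpha - \sum_{p=1}^k \hat C_p w_{p\alpha} \Big)^2.$$

Since it can also be shown that

$$Ez_\alpha = \sum_{p=1}^k C_p w_{p\alpha}, \qquad \alpha = 1, 2, \ldots, n-1,$$

it is clear that the transformation (3.5) has reduced the problem to the standard
one indicated in Section 2, with (n - 1) variables instead of n, and the theorem
follows by the arguments given in [1].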
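
The effect of an orthogonal transformation with one constant direction on the intra-class covariance matrix can be checked directly. The sketch below uses a Helmert-type matrix as one possible choice; the paper does not commit to a specific matrix, and here the constant vector is stored as the last row and applied as $L R L'$, which is an assumed convention. The values of n, $\sigma^2$ and $\rho$ are illustrative.

```python
import math


def helmert(n):
    """Orthogonal matrix: rows 0..n-2 are contrasts, row n-1 is constant 1/sqrt(n)."""
    rows = []
    for k in range(1, n):
        norm = math.sqrt(k * (k + 1))
        rows.append([1.0 / norm] * k + [-k / norm] + [0.0] * (n - k - 1))
    rows.append([1.0 / math.sqrt(n)] * n)
    return rows


def intraclass(n, sigma2, rho):
    """Covariance matrix with sigma2 on the diagonal and sigma2 * rho elsewhere."""
    return [[sigma2 * (1.0 if i == j else rho) for j in range(n)] for i in range(n)]


def congruence(L, R):
    """Return the matrix product L R L^T."""
    n = len(R)
    LR = [[sum(L[i][k] * R[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(LR[i][k] * L[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


n, sigma2, rho = 5, 2.0, 0.3
D = congruence(helmert(n), intraclass(n, sigma2, rho))
print([round(D[i][i], 10) for i in range(n)])
```

The transformed matrix is diagonal, with $\sigma^2(1-\rho)$ in the first n - 1 positions and $\sigma^2\{1 + (n-1)\rho\}$ in the last, matching the two variance factors in the transformed likelihood.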
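THEOREM B. Let $y_1, \ldots, y_n$ be as specified in Theorem A. Let $H_0$ be the statistical
hypothesis that $C_{r+1} = C_{r+1,0}, \ldots, C_k = C_{k0}$, regardless of the values of
$C_1, \ldots, C_r$. When $H_0$ is true, the quantities

$$V = \sum_{\alpha=1}^n \Big( y_\alpha - \hat\mu - \sum_{p=1}^k \hat C_p x_{p\alpha} \Big)^2, \qquad
q = \sum_{g,h=r+1}^k b_{gh} (\hat C_g - C_{g0})(\hat C_h - C_{h0})$$

are independently distributed as $(1 - \rho)\sigma^2\chi^2$ with (n - k - 1) and (k - r) degrees
of freedom respectively. Here $\hat C_p$ is defined by (3.3) and the $b_{gh}$ are defined by the
matrix equation

$$\begin{pmatrix} b_{r+1,r+1} & \cdots & b_{r+1,k} \\ \vdots & & \vdots \\ b_{k,r+1} & \cdots & b_{kk} \end{pmatrix}
= \begin{pmatrix} s^{r+1,r+1} & \cdots & s^{r+1,k} \\ \vdots & & \vdots \\ s^{k,r+1} & \cdots & s^{kk} \end{pmatrix}^{-1}. \tag{3.6}$$

Thus

$$F = \frac{(n-k-1)\,q}{(k-r)\,V}$$

provides a test of $H_0$ for $1 > \rho > -1/(n-1)$.

PROOF. It is clear that application of the transformation (3.5) will reduce the
problem to that of proving the corresponding theorem in standard regression
theory with a sample of (n - 1) independent observations. The theorem follows.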


4. An application. We suppose we have m replications of the correlated
sample of Section 3, generalizing slightly by further assuming that $\mu$ differs
from replication to replication, assuming the value $\tau_i$ for the ith replication.
Thus, if $y_{i\alpha}$ is the $\alpha$th measurement in the ith replication,

$$Ey_{i\alpha} = \tau_i + \sum_{p=1}^k C_p x_{p\alpha}, \qquad i = 1, 2, \ldots, m; \quad \alpha = 1, 2, \ldots, n, \tag{4.1}$$

and we ask for estimates of the $\tau_i$ and $C_p$, and tests of significance for the $C_p$.







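It follows easily from Section 3 that unbiased estimates of the $\tau_i$ and the $C_p$
are given by

$$\hat\tau_i = \bar y_{i\cdot} - \sum_{p=1}^k \hat C_p \bar x_p, \quad i = 1, 2, \ldots, m; \qquad
\hat C_p = \sum_{r=1}^k s_{ry}\, s^{rp}, \quad p = 1, 2, \ldots, k, \tag{4.2}$$

where

$$s_{ry} = \sum_{\alpha=1}^n (x_{r\alpha} - \bar x_r)(\bar y_{\cdot\alpha} - \bar y_{\cdot\cdot}), \qquad
\bar y_{i\cdot} = \frac{1}{n} \sum_{\alpha=1}^n y_{i\alpha}, \qquad
\bar y_{\cdot\alpha} = \frac{1}{m} \sum_{i=1}^m y_{i\alpha}, \qquad
\bar y_{\cdot\cdot} = \frac{1}{n} \sum_{\alpha=1}^n \bar y_{\cdot\alpha},$$

and $s_{ij}$ is defined as in (3.33).
We now ask for the joint distribution of $\hat C_1, \ldots, \hat C_k$ and

$$V = \sum_{i=1}^m \sum_{\alpha=1}^n \Big\{ y_{i\alpha} - \bar y_{i\cdot} - \sum_{p=1}^k \hat C_p (x_{p\alpha} - \bar x_p) \Big\}^2. \tag{4.22}$$

It follows as in Section 3 that $\hat C_1, \ldots, \hat C_k$ have a multivariate normal distribution,
and it is sufficient for our purposes to examine the joint distribution of V
and W, where

$$W = m(\hat C - C)\, S\, (\hat C - C)', \qquad \hat C - C = (\hat C_1 - C_1, \ldots, \hat C_k - C_k).$$

By an application of the transformation $z_i = y_i L$ to the n observations of
each replication, one obtains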


THEOREM A'. Let $y_{i1}, \ldots, y_{in}$ ($i = 1, 2, \ldots, m$) be a sample of one from a
multivariate normal population with means given by (4.1) and the mn × mn variance
matrix

$$\begin{pmatrix} R_n & 0 & \cdots & 0 \\ 0 & R_n & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & R_n \end{pmatrix}.$$

Then $(\hat C_1 - C_1), \ldots, (\hat C_k - C_k)$ have a multivariate normal distribution with
zero means and variance matrix $[(1 - \rho)\sigma^2/m]\,S^{-1}$, and W and V are independent
$(1 - \rho)\sigma^2\chi^2$ variates with k and m(n - 1) - k degrees of freedom respectively.







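We can also prove
THEOREM B'. Let $y_{i1}, \ldots, y_{in}$ ($i = 1, 2, \ldots, m$) satisfy the conditions of
Theorem A'. Let $H_0$ be as specified in Theorem B. Then the statistic

$$F = \frac{[m(n-1) - k]\, q}{(k-r)\, V},$$

where

$$q = \sum_{g,h=r+1}^k b_{gh} (\hat C_g - C_{g0})(\hat C_h - C_{h0})$$

and $b_{gh}$ is defined by the matrix equation

$$\begin{pmatrix} b_{r+1,r+1} & \cdots & b_{r+1,k} \\ \vdots & & \vdots \\ b_{k,r+1} & \cdots & b_{kk} \end{pmatrix}
= m \begin{pmatrix} s^{r+1,r+1} & \cdots & s^{r+1,k} \\ \vdots & & \vdots \\ s^{k,r+1} & \cdots & s^{kk} \end{pmatrix}^{-1},$$

is distributed as Snedecor's F and provides a test of $H_0$.

The proof of Theorem B' is along the same lines as that of Theorem A' and is
omitted.

We also remark that theorems akin to A' and B' hold if $\tau_i = \tau$, $i = 1, 2, \ldots, m$.
We simply may take

$$\hat\tau = \bar y_{\cdot\cdot} - \sum_{p=1}^k \hat C_p \bar x_p.$$

The estimates of the $C_p$ need not be changed. If now we let

$$V' = \sum_{i=1}^m \sum_{j=1}^n \Big\{ y_{ij} - \bar y_{\cdot\cdot} - \sum_{p=1}^k \hat C_p (x_{pj} - \bar x_p) \Big\}^2,$$

Theorems A' and B' hold with $\tau_i$ and $\hat\tau_i$ replaced by $\tau$ and $\hat\tau$, with V replaced by
V' and m(n - 1) - k degrees of freedom replaced by nm - k - 1 degrees of
freedom.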


As an example of the application of these notions we consider an analysis of
variance problem. The same problem has been considered in [3]. Suppose we
have mn observations,

$$\begin{matrix} y_{11} & \cdots & y_{1n} \\ \vdots & & \vdots \\ y_{m1} & \cdots & y_{mn} \end{matrix}$$

where the $y_{i\alpha}$ are jointly normal with covariance matrix $R_{mn}$ and with means
given by

$$Ey_{i\alpha} = \tau_i + C_\alpha. \tag{4.3}$$







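In [3] it is shown that the F ratio for "columns" calculated in the usual way
has the usual F distribution when the $C_\alpha$ are equal. To deduce this test from
our results we write (4.3) as

$$Ey_{i\alpha} = \tau_i + \sum_{p=1}^n C_p x_{p\alpha}, \tag{4.31}$$

where

$$x_{p\alpha} = \begin{cases} 0, & p \neq \alpha, \\ 1, & p = \alpha. \end{cases}$$

We have then

$$s_{pq} = \sum_{\alpha=1}^n (x_{p\alpha} - \bar x_p)(x_{q\alpha} - \bar x_q) = \begin{cases} -\dfrac{1}{n}, & p \neq q, \\[1ex] \dfrac{n-1}{n}, & p = q. \end{cases}$$

The n × n matrix, S, is singular. To overcome this difficulty we can, since we
are only interested in class differences rather than in the absolute values of the
$C_p$, arbitrarily assign to one of the $C_p$, say $C_n$, the value zero. The test of
column differences then becomes a test that $C_1 = C_2 = \cdots = C_{n-1} = 0$. It
is then easy to see that $\hat C_p = \bar y_{\cdot p} - \bar y_{\cdot n}$, $p = 1, 2, \ldots, n-1$, and

$$\hat\tau_i = \bar y_{i\cdot} - \frac{1}{n} \sum_{p=1}^{n-1} \hat C_p.$$

If we substitute these values in $q = m \hat C S^* \hat C'$ and

$$V = \sum_{i=1}^m \sum_{\alpha=1}^n \Big( y_{i\alpha} - \hat\tau_i - \sum_{p=1}^{n-1} \hat C_p x_{p\alpha} \Big)^2,$$

where

$$\hat C = (\hat C_1, \ldots, \hat C_{n-1})$$

and $S^*$ is the minor of $s_{nn}$ in S, we find after a little algebraic reduction that

$$F = \frac{(n-1)(m-1)\, q}{(n-1)\, V} = \frac{m(n-1)(m-1) \sum_{j} (\bar y_{\cdot j} - \bar y_{\cdot\cdot})^2}{(n-1) \sum_{i} \sum_{j} (y_{ij} - \bar y_{i\cdot} - \bar y_{\cdot j} + \bar y_{\cdot\cdot})^2},$$

and this is the desired statistic.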


Suggestions of the referee for simplifying the proofs are gratefully acknowl- 
edged. 







REFERENCES 


[1] S. S. WILKS, Mathematical Statistics, Princeton University Press, 1946.

[2] H. CRAMÉR, Mathematical Methods of Statistics, Princeton University Press, 1946, pp.
490-496.

[3] D. F. VOTAW, A. W. KIMBALL, AND J. A. RAFFERTY, "Compound symmetry tests in the
multivariate analysis of medical experiments," Biometrics, Vol. 6 (1950), pp.
259-281.

[4] J. E. WALSH, "Concerning the effect of intra-class correlation on certain significance
tests," Annals of Math. Stat., Vol. 18 (1947), pp. 88-96.





MINIMUM VARIANCE ESTIMATION WITHOUT 
REGULARITY ASSUMPTIONS 


By DOUGLAS G. CHAPMAN¹ AND HERBERT ROBBINS
University of Washington and University of North Carolina 


1. Summary and Introduction. Following the essential steps of the proof of 
the Cramér-Rao inequality [1, 2] but avoiding the need to transform coordinates 
or to differentiate under integral signs, a lower bound for the variance of estimators 
is obtained which is (a) free from regularity assumptions and (b) at least equal 
to and in some cases greater than that given by the Cramér-Rao inequality. 
The inequality of this paper might also be obtained from Barankin’s general 
result² [3]. Only the simplest case, that of unbiased estimation of a single real
parameter, is considered here but the same idea can be applied to more general
problems of estimation. 


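2. Lower bound. Let $\mu$ be a fixed measure on Euclidean n-space X and let
the random vector $x = (x_1, \ldots, x_n)$ have a probability distribution which is
absolutely continuous with respect to $\mu$, with density function $f(x, \alpha)$, where $\alpha$
is a real parameter belonging to some parameter set A. Define $S(\alpha)$ as follows:

$$f(x, \alpha) > 0, \ \text{a.e. } x \text{ in } S(\alpha), \qquad f(x, \alpha) = 0, \ \text{a.e. } x \text{ in } X - S(\alpha).$$

Let t = t(x) be any unbiased estimator of $\alpha$, so that for every $\alpha$ in A,

$$\int_X t f(x, \alpha)\, d\mu = \alpha. \tag{1}$$

If $\alpha$, $\alpha + h$ ($h \neq 0$) are any two distinct values in A such that

$$S(\alpha + h) \subset S(\alpha), \tag{2}$$

then, writing S for $S(\alpha)$,

$$\int_S f(x, \alpha)\, d\mu = \int_{S(\alpha+h)} f(x, \alpha + h)\, d\mu = \int_S f(x, \alpha + h)\, d\mu = 1,$$

$$\int_S t f(x, \alpha)\, d\mu = \alpha, \qquad \int_S t f(x, \alpha + h)\, d\mu = \alpha + h,$$

so that

$$\int_S (t - \alpha) \sqrt{f(x, \alpha)} \cdot \frac{f(x, \alpha + h) - f(x, \alpha)}{h f(x, \alpha)} \sqrt{f(x, \alpha)}\, d\mu = 1.$$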

1 This research was supported in part by the Office of Naval Research. 
2 But again with some additional restrictions.







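Applying Schwarz's inequality we obtain the relation

$$1 \le \int_S (t - \alpha)^2 f(x, \alpha)\, d\mu \cdot \int_S \left[ \frac{f(x, \alpha + h) - f(x, \alpha)}{h f(x, \alpha)} \right]^2 f(x, \alpha)\, d\mu
= \operatorname{Var}(t \mid \alpha) \cdot \frac{1}{h^2} \left\{ \int_S \left[ \frac{f(x, \alpha + h)}{f(x, \alpha)} \right]^2 f(x, \alpha)\, d\mu - 1 \right\}. \tag{3}$$

If we set

$$J = J(\alpha, h) = \frac{1}{h^2} \left\{ \left[ \frac{f(x, \alpha + h)}{f(x, \alpha)} \right]^2 - 1 \right\},$$

then (3) can be written in the form

$$\operatorname{Var}(t \mid \alpha) \ge \frac{1}{E(J \mid \alpha)}. \tag{4}$$

Since (4) holds whenever $\alpha$, $\alpha + h$ are any two distinct elements of A satisfying
(2) we obtain the fundamental inequality

$$\operatorname{Var}(t \mid \alpha) \ge \frac{1}{\inf_h E(J \mid \alpha)}, \tag{5}$$

where the infimum is taken over all $h \neq 0$ such that (2) is satisfied. It should
be noted that (5) holds without any restriction on $f(x, \alpha)$ and without any
restriction on t other than (1).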

It is possible that $E(J \mid \alpha)$ does not exist (finitely) for any h. With the usual
convention that $E(J \mid \alpha) = \infty$ in this case, (5) is still a valid, though trivial,
inequality.

In applications $\mu$ will often be Lebesgue measure on X. It could equally well
be a discrete measure on a countable set of points in X. Furthermore, if the set
where $f(x, \alpha) > 0$ is independent of $\alpha$ then (2) is trivially satisfied for all $\alpha + h$
in A.

We shall have occasion to compare (5) with the Cramér-Rao inequality

$$\operatorname{Var}(t \mid \alpha) \ge \frac{1}{E(\psi^2 \mid \alpha)}, \qquad \psi = \psi(\alpha) = \frac{\partial}{\partial \alpha} \ln f(x, \alpha). \tag{6}$$

This inequality is usually derived for distributions with range independent of
the parameter and under certain regularity conditions on both $f(x, \alpha)$ and the
unbiased estimator t.


3. Examples. 


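Example 1. Unbiased estimation of the mean of a normal distribution based on a
random sample of size n. Here

$$f(x, \alpha) = (2\pi)^{-n/2} \sigma^{-n} \exp\Big\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \alpha)^2 \Big\},$$

where $\sigma$ is a positive constant, and

$$J = \frac{1}{h^2} \Big\{ \exp\Big[ -\frac{1}{\sigma^2} \sum_{i=1}^n \big( (x_i - \alpha - h)^2 - (x_i - \alpha)^2 \big) \Big] - 1 \Big\}
= \frac{n}{\sigma^2 k^2} \big\{ \exp(-k^2 + 2ku) - 1 \big\},$$

where we have set $u = \sum_{i=1}^n (x_i - \alpha)/(\sigma\sqrt{n})$, $k = h\sqrt{n}/\sigma \neq 0$.

When the mean is $\alpha$, u is normally distributed with mean 0 and variance 1, and
we find after a simple computation that

$$E(J \mid \alpha) = n(e^{k^2} - 1)/(\sigma^2 k^2),$$

$$\inf_h E(J \mid \alpha) = \lim_{k \to 0} \big[ n(e^{k^2} - 1)/(\sigma^2 k^2) \big] = n/\sigma^2 = E(\psi^2 \mid \alpha). \tag{7}$$

Hence if t is any unbiased estimator of $\alpha$ it follows from (5) that

$$\operatorname{Var}(t \mid \alpha) \ge \sigma^2/n. \tag{8}$$

Since the sample mean $\bar x$ is an unbiased estimator of $\alpha$ with $\operatorname{Var}(\bar x \mid \alpha) = \sigma^2/n$,
it follows that $\bar x$ has minimum variance in the class of all unbiased estimators
of $\alpha$.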

In this example the Cramér-Rao inequality (6) yields precisely the same 
bound (8). 
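
The behavior of $E(J \mid \alpha)$ in Example 1 can be tabulated to see how the infimum arises as k tends to 0. This is a small illustrative sketch; the function name and the values of n, $\sigma$ and k are assumptions made for the illustration.

```python
import math


def expected_J_normal_mean(k, n, sigma):
    """E(J | alpha) = n (exp(k^2) - 1) / (sigma^2 k^2) for the normal-mean example."""
    return n * math.expm1(k * k) / (sigma ** 2 * k ** 2)


n, sigma = 10, 2.0
ks = [2.0, 1.0, 0.5, 0.1, 0.01]
values = [expected_J_normal_mean(k, n, sigma) for k in ks]
for k, v in zip(ks, values):
    print(k, v, 1.0 / v)  # 1 / v is the lower bound on Var(t | alpha) for this k
print("limit n / sigma^2 =", n / sigma ** 2)
```

The values decrease toward $n/\sigma^2$ as k shrinks, so the best bound obtainable from (5) in this example is $\sigma^2/n$, the same as the Cramér-Rao bound.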

Corresponding results hold for the unbiased estimation of the variance when
the mean is known. Both (5) and (6) yield the inequality

$$\operatorname{Var}(t \mid \alpha) \ge 2\alpha^2/n,$$

where $\alpha$ is the unknown variance. The equality sign holds for

$$t = n^{-1} \sum_{i=1}^n (x_i - m)^2,$$

where m is the mean of the normal population.
Example 2. Unbiased estimation of the standard deviation of a normal population
with known mean. Here

$$f(x, \alpha) = (2\pi)^{-n/2} \alpha^{-n} \exp\Big\{ -\frac{1}{2\alpha^2} \sum_{i=1}^n (x_i - m)^2 \Big\}.$$

Setting $k = h/\alpha$ we find that for $-1 < k < \sqrt{2} - 1$, $k \neq 0$,

$$E(J \mid \alpha) = \big\{ (1 + k)^{-n} [1 - 2k - k^2]^{-n/2} - 1 \big\} / (\alpha^2 k^2). \tag{9}$$

In this case also, $\lim_{k \to 0} E(J \mid \alpha) = 2n/\alpha^2 = E(\psi^2 \mid \alpha)$. But the minimum value
of $E(J \mid \alpha)$ is not approached in the neighborhood of h = k = 0, and the inequality
(5) is sharper than (6). We shall consider only the case n = 2. Equation
(9) then becomes

$$E(J \mid \alpha) = (p + 1)^2 / [\alpha^2 p^2 (2 - p^2)],$$

where we have set p = 1 + k and $0 < p < \sqrt{2}$. We have for p =
.8393, $1/E(J \mid \alpha) = .2698\,\alpha^2$, so that by (5)

$$\operatorname{Var}(t \mid \alpha) \ge .2698\,\alpha^2 > .25\,\alpha^2 = \frac{1}{E(\psi^2 \mid \alpha)}.$$
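
The numerical constants in Example 2 for n = 2 can be reproduced with a simple grid search over p. The sketch is illustrative only: the grid size and function names are assumptions, and bounds are expressed in units of $\alpha^2$.

```python
import math


def inverse_expected_J(p):
    """1 / E(J | alpha) in units of alpha^2 for n = 2, where p = 1 + k."""
    return p * p * (2.0 - p * p) / (p + 1.0) ** 2


def best_bound(steps=20000):
    """Maximize 1 / E(J | alpha) over a grid of p in (0, sqrt(2))."""
    upper = math.sqrt(2.0)
    best_p, best_value = 0.0, 0.0
    for i in range(1, steps):
        p = upper * i / steps
        value = inverse_expected_J(p)
        if value > best_value:
            best_p, best_value = p, value
    return best_p, best_value


p_star, bound = best_bound()
cramer_rao = inverse_expected_J(1.0)   # limiting value as k -> 0, i.e. 1 / E(psi^2)
estimator_variance = 4.0 / math.pi - 1.0  # variance of the unbiased estimator, n = 2
print(round(p_star, 4), round(bound, 4), cramer_rao, round(estimator_variance, 4))
```

The search returns p near .8393 with bound near .2698, which exceeds the Cramér-Rao value .25 but stays below $4/\pi - 1 \approx .2732$, the variance actually attained by the unbiased estimator for n = 2.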





It is interesting to note that the unbiased estimator

$$t = \frac{\Gamma(n/2)}{\sqrt{2}\, \Gamma[\tfrac{1}{2}(n+1)]} \sqrt{\sum_{i=1}^n (x_i - m)^2}$$

has variance

$$\alpha^2 \left\{ \frac{n}{2} \left[ \frac{\Gamma(n/2)}{\Gamma[\tfrac{1}{2}(n+1)]} \right]^2 - 1 \right\},$$

which for n = 2 becomes

$$\alpha^2 \left[ \frac{4}{\pi} - 1 \right] = .2732\,\alpha^2.$$

But it can be shown using results of Lehmann and Scheffé [4], or of Hoel [5],
which were derived from Blackwell's theorem on conditional expectation [6],
that no other unbiased estimator can have smaller variance than t. Thus (5)
does not give the greatest lower bound in this case.

Various examples of the application of (5) can be given where $S(\alpha)$ is not a
constant and where the Cramér-Rao formula is invalid (see for example Cramér
[1], p. 485). It should be noted, however, that in many of the standard problems
of this type stronger results can be obtained by other methods.

Another class of estimation problems where (5) may be applied occurs if the
parameter space is discrete. Again in this case the Cramér-Rao formula does
not hold. An example of this type has been given by Chapman ([7], pp. 149-150).
Other applications of this type and some results related to this paper were
obtained recently by Hammersley [8].


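4. General comparison with the Cramér-Rao inequality. Let
$$\bar{J} = \bar{J}(\alpha, h) = \left[\frac{f(x, \alpha + h) - f(x, \alpha)}{h\, f(x, \alpha)}\right]^2; \tag{10}$$
then
$$E(\bar{J} \mid \alpha) = E(J \mid \alpha).$$

Hence in the fundamental inequality (5) we can replace $J$ by $\bar{J}$. But from (10)
it is clear that
$$\lim_{h \to 0} \bar{J}(\alpha, h) = \left[\frac{\partial}{\partial \alpha} \ln f(x, \alpha)\right]^2 = \psi^2(\alpha)$$
whenever the latter exists.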

Assuming now the usual regularity conditions under which the Cramér-Rao
lower bound is derived, that $S(\alpha)$ is independent of $\alpha$ and that $f(x, \alpha)$ is
sufficiently regular that we may pass to the limit inside the integral sign,
$$E(\psi^2 \mid \alpha) = E\big[\lim_{h \to 0} \bar{J} \mid \alpha\big] = \lim_{h \to 0} E(\bar{J} \mid \alpha) \ge \inf_{h} E(\bar{J} \mid \alpha) = \inf_{h} E(J \mid \alpha), \tag{11}$$







the infimum being taken over admissible values of $h$. It follows that the
inequality (5) is at least as sharp as that given by the Cramér-Rao formula (6).
On the other hand, when $x = (x_1, \cdots, x_n)$ is a random sample from a regular
distribution, and when $E(\psi^2 \mid \alpha) < \infty$, then for any fixed $h \ne 0$, there exists
an $n_0$ such that for $n > n_0$
$$E(\psi^2 \mid \alpha) < E(J \mid \alpha). \tag{12}$$


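Without loss of generality assume $E(J \mid \alpha) < \infty$. Letting $g(t, \alpha)$ denote the
density function of a single $x_i$ and $\nu$ the one-dimensional measure which generates
$\mu$, it is easily verified that
$$E(J \mid \alpha) = \frac{1}{h^2}\left\{\left[\int \frac{g^2(t, \alpha + h)}{g(t, \alpha)}\, d\nu\right]^n - 1\right\}.$$
By hypothesis, except on a set of measure 0,
$$g(t, \alpha + h) = g(t, \alpha) + h\, \frac{\partial g}{\partial \alpha}\bigg|_{\alpha = \alpha(h)}, \qquad \alpha \le \alpha(h) \le \alpha + h.$$
Hence
$$\int \frac{g^2(t, \alpha + h)}{g(t, \alpha)}\, d\nu = 1 + 2h \int \frac{\partial g}{\partial \alpha}\bigg|_{\alpha = \alpha(h)} d\nu + h^2 \int \frac{1}{g(t, \alpha)} \left(\frac{\partial g}{\partial \alpha}\right)^2 \bigg|_{\alpha = \alpha(h)} d\nu. \tag{13}$$
Denoting the last integral of the right hand side of (13) by $R(\alpha, h)$ and noting
that the relation
$$\int g(t, \alpha)\, d\nu = 1$$
may be differentiated under the integral sign so that the middle term vanishes,
it follows that
$$E(J \mid \alpha) = \frac{[1 + h^2 R(\alpha, h)]^n - 1}{h^2} \ge n R(\alpha, h) + \tfrac{1}{2} n(n - 1) h^2 R^2(\alpha, h). \tag{14}$$
On the other hand, from (11) and (14),
$$E(\psi^2 \mid \alpha) = n R(\alpha, 0). \tag{15}$$
In order that different parameters may be distinguishable we must have
$$\frac{\partial g}{\partial \alpha}\bigg|_{\alpha = \alpha(h)} \ne 0$$
for a set of positive measure on the $t$-axis, and hence $R(\alpha, h) > 0$. From this and
the fact that $R(\alpha, 0)$ is independent of $n$, (12) follows at once, for sufficiently
large $n$, from (14) and (15).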


REFERENCES

[1] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.
[2] C. R. Rao, "Information and the accuracy attainable in the estimation of statistical
parameters," Bull. Calcutta Math. Soc., Vol. 37 (1945), pp. 81-91.
[3] E. W. Barankin, "Locally best unbiased estimates," Annals of Math. Stat., Vol. 20
(1949), pp. 477-501. (More complete references to the general problem are given
in this paper.)
[4] E. L. Lehmann and H. Scheffé, "Completeness, similar regions and unbiased
estimation, Part 1," Sankhyā, Vol. 10 (1950), pp. 305-340.
[5] P. G. Hoel, "Conditional expectation and the efficiency of estimates," Annals of Math.
Stat., Vol. 22 (1951), pp. 299-301.
[6] D. Blackwell, "Conditional expectation and unbiased sequential estimation,"
Annals of Math. Stat., Vol. 18 (1947), pp. 105-110.
[7] D. G. Chapman, "Some properties of the hypergeometric distribution with applications
to zoological sample censuses," Univ. of California Publ. Statist., Vol. 1, No.
7 (1951), pp. 131-160.
[8] J. M. Hammersley, "On estimating restricted parameters," Jour. Roy. Stat. Soc.,
Ser. B, Vol. 12 (1950), pp. 192-229.





NOTES 


A GENERAL CONCEPT OF UNBIASEDNESS 
By E. L. Lehmann


University of California, Berkeley, and Princeton University 


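The term unbiasedness was introduced by Neyman and Pearson [1] in
connection with hypothesis testing. A test of the hypothesis $\theta \in \omega$ against the
alternatives $\theta \in \Omega - \omega$ is said to be unbiased at level $\alpha$ if its power function $\beta$ satisfies
$$\beta(\theta) \le \alpha \ \text{ for } \theta \in \omega, \qquad \beta(\theta) \ge \alpha \ \text{ for } \theta \in \Omega - \omega. \tag{1}$$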


In 1937 Neyman [2] developed a theory of estimation by confidence sets.
He established a duality with the theory of hypothesis testing, so that to each
notion of one theory corresponds an analogous one in the other. In particular,
he defined a family of confidence sets $A(x)$ to be unbiased if
$$P_\theta(A(X) \ni \theta') \le P_\theta(A(X) \ni \theta) \quad \text{for all } \theta \text{ and } \theta'. \tag{2}$$


While the above two definitions are closely related, a third use of the term
unbiasedness was made in a rather different context. In presenting their version
of the Gauss-Markov theorem on least squares David and Neyman [3] defined a
point estimate $\delta(X)$ of $g(\theta)$ to be unbiased if its expectation coincides with the
estimated value, that is, if
$$E_\theta \delta(X) = g(\theta). \tag{3}$$


It was pointed out later by Brown [4] that one obtains other analogous definitions
by postulating that some central value of the distribution of $\delta(X)$ other than
the mean coincides with the estimated value. Using the median as an example he
defined $\delta(X)$ to be median-unbiased if
$$P_\theta(\delta(X) > g(\theta)) = P_\theta(\delta(X) < g(\theta)) \quad \text{for all } \theta. \tag{4}$$


In view of Wald's theory of decision functions [5] it seems tempting to try
to give a definition of unbiasedness at the level of generality of this theory.
Suppose we are concerned with a decision problem where the loss resulting
from a decision $\delta(X)$ is $W(\theta, \delta(X))$ when the true parameter value is $\theta$. In analogy
with (2) we shall say that a decision procedure $\delta(X)$ is unbiased if for each $\theta$
$$E_\theta W(\theta', \delta(X)) = \min \quad \text{when } \theta' = \theta. \tag{5}$$

This clearly reduces to Neyman's definition for confidence sets if one uses for
loss function,
$$W(\theta, \delta(x)) = \begin{cases} 0 & \text{if the confidence set } \delta(x) \text{ covers } \theta, \\ 1 & \text{otherwise.} \end{cases} \tag{6}$$




In order to obtain an interpretation of condition (5), let us consider the case
that for each parameter value $\theta$ there exists a unique "correct" decision $d$ and
that each $d$ is correct for at least some $\theta$. This is the case for example in hypothesis
testing and in point estimation. Here a correct decision may be defined by the
condition $W(\theta, d) = 0$. Let us say that two parameter values $\theta$, $\theta'$ are equivalent,
$\theta \sim \theta'$, if the correct decision is the same for both of them, and let us suppose
that for any decision $d'$
$$W(\theta_1, d') = W(\theta_2, d') \quad \text{whenever } \theta_1 \sim \theta_2. \tag{7}$$

Then the loss $W(\theta, d')$ depends only on the actual decision taken, say $d'$,
and the decision $d$ that would have been correct, and we may write for it $W(d, d')$.
The loss $W(d, d')$ is a measure of how far the two decisions $d$ and $d'$ are apart,
and (5) states that a decision function $\delta(X)$ is unbiased if on the average it
comes closer to the correct decision than to any incorrect decision.

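Let us now apply this notion to some particular examples. Let the decision
to accept and reject the hypothesis $H: \theta \in \omega$ be denoted by $d_0$ and $d_1$, respectively.
Since in the Neyman-Pearson theory of hypothesis testing one is concerned only
with the probabilities of the two types of error, the natural associated loss
function is of the form
$$W(\theta, d_0) = \begin{cases} a & \text{if } \theta \in \Omega - \omega, \\ 0 & \text{if } \theta \in \omega; \end{cases} \qquad W(\theta, d_1) = \begin{cases} b & \text{if } \theta \in \omega, \\ 0 & \text{if } \theta \in \Omega - \omega. \end{cases} \tag{8}$$
It is easy to see that in this case (5) becomes
$$P_\theta(d_1) \le \frac{a}{a + b} \ \text{ for } \theta \in \omega, \qquad P_\theta(d_1) \ge \frac{a}{a + b} \ \text{ for } \theta \in \Omega - \omega, \tag{9}$$
where $P_\theta(d)$ denotes the probability that $\delta(X) = d$ when $\theta$ is the true parameter
value. This is exactly the usual definition (1) with $\alpha = a/(a + b)$.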
Let us next consider point estimation where the loss is taken as the square
of the error. If the function to be estimated is $g(\theta)$, condition (5) becomes
$$E_\theta[\delta(X) - g(\theta')]^2 \ge E_\theta[\delta(X) - g(\theta)]^2 \quad \text{for all } \theta, \theta'. \tag{10}$$

Let $E_\theta \delta(X) = h(\theta)$. In the usual case that $h(\theta)$ is one of the possible values of
the function $g$, the left-hand side of (10) is minimized for $g(\theta') = h(\theta)$. Thus
the inequality holds for all $\theta'$ if and only if $g(\theta) = h(\theta)$, which is equivalent to
(3). So again (5) reduces just to the usual definition.


Even if $h(\theta)$ is not one of the possible values of $g$, it is easily seen that (10)
is equivalent to
$$|h(\theta) - g(\theta)| = \min_{\theta'} |h(\theta) - g(\theta')|.$$
Then, if for example $\Omega$ is a real interval and $g$ is continuous and strictly monotone,
there can exist at most two values of $\theta$ for which $g(\theta) \ne h(\theta)$. If further, as is
usually the case, $h(\theta)$ is continuous for all estimates $\delta$, we must have $h(\theta) = g(\theta)$.

Quite analogously one sees that if $W(\theta, \delta(x)) = |\delta(x) - g(\theta)|$, definition (5)
reduces to Brown's notion of median-unbiasedness.

While the definition given here seems satisfactory in that it does reduce
under reasonable assumptions to the usual concepts, it is somewhat more
restrictive than appears at first sight. If for example there exists for each $\theta$ a unique
correct decision $d$ and if the loss function is of the form
$$W(\theta, d') = f(\theta) V(d, d'),$$
then, with the trivial exception of procedures for which $E_\theta V(d, \delta(X)) = 0$ for
some $d$ and some value of $\theta$ in $\omega_d$, no unbiased procedure can exist unless $f(\theta)$
is constant on each $\omega_d$. For let $\theta, \theta' \in \omega_d$. On substituting in (5) we see that
unbiasedness implies $f(\theta') \ge f(\theta)$ and hence by symmetry $f(\theta') = f(\theta)$. In
hypothesis testing for example if the loss is zero for a correct decision, it follows,
again with trivial exceptions, that unbiased tests can exist only if the loss
function is given by (8).

It is perhaps worth pointing out certain connections between the principle
of unbiasedness and that of invariance. Consider for example the problem of
estimating $\theta$ from a sample $X_1, \cdots, X_n$ where the $X$'s are uniformly distributed
on $(0, \theta)$. If one takes as loss function
$$W(\theta, \delta(x_1, \cdots, x_n)) = [\delta(x_1, \cdots, x_n) - \theta]^2/\theta^2 \tag{11}$$
the problem transforms in an obvious manner under a change of scale, and
one may wish to consider only estimates having the invariance property
$$\delta(cX_1, \cdots, cX_n) = c\,\delta(X_1, \cdots, X_n) \quad \text{for all } c > 0. \tag{12}$$

If $Y = \max(X_1, \cdots, X_n)$, it is easily seen that among all invariant estimates
the one that uniformly minimizes the expected loss is
$$\frac{n + 2}{n + 1}\, Y. \tag{13}$$
This estimate does not have the usual unbiasedness property since
$$E_\theta\left[\frac{n + 2}{n + 1}\, Y\right] = \frac{n(n + 2)}{(n + 1)^2}\, \theta.$$

However a simple computation shows that (13) is unbiased in the sense of (5)
with respect to the invariant loss function (11).
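The "simple computation" can be sketched with exact rational arithmetic. This sketch is not part of the original note and rests on two assumptions: only estimates of the form $cY$ are considered, and the standard moments $E(Y) = n\theta/(n+1)$, $E(Y^2) = n\theta^2/(n+2)$ of the maximum of $n$ uniform variables are used. The function names are hypothetical.

```python
from fractions import Fraction

def moments_of_max(n):
    """E(Y)/theta and E(Y^2)/theta^2 for Y = max of n uniform(0, theta) variables."""
    return Fraction(n, n + 1), Fraction(n, n + 2)

def best_scale_factor(n):
    """c minimizing E[(cY - theta)^2 / theta^2] = c^2 E(Y^2)/theta^2 - 2c E(Y)/theta + 1."""
    m1, m2 = moments_of_max(n)
    return m1 / m2

def minimizing_theta_ratio(n, c):
    """theta'/theta minimizing E_theta[(cY - theta')^2 / theta'^2] over theta' > 0.

    With u = 1/theta' the risk is c^2 E(Y^2) u^2 - 2 c E(Y) u + 1,
    a quadratic in u whose minimum is at u = E(Y) / (c E(Y^2)).
    """
    m1, m2 = moments_of_max(n)
    return c * m2 / m1

n = 5
c = best_scale_factor(n)
mean_ratio = c * moments_of_max(n)[0]
print(c, mean_ratio, minimizing_theta_ratio(n, c))
```

For $n = 5$ this gives $c = 7/6$, an expectation of $35\theta/36$ (so the estimate is not mean-unbiased), and a minimizing ratio $\theta'/\theta = 1$, which is condition (5).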

More generally, let $G$ be a group of measurable 1:1 transformations on the
sample space. Let $gX$ be the random variable that takes on the value $gx$ when
$X = x$, and suppose that when $X$ has a distribution $p_\theta$, $\theta \in \Omega$, then $gX$ has a
distribution $p_{\theta'}$, $\theta' \in \Omega$. Denote this $\theta'$ by $\bar{g}\theta$ and suppose that $\bar{g}\theta$ defines a 1:1
transformation on $\Omega$. Let $\bar{G}$ be the group of transformations $\bar{g}$ and assume that
there exists a group $G^*$ of 1:1 transformations on the decision space $D$ such that
$G^*$ is homomorphic to $G$ and
$$W(\bar{g}\theta, g^* d) = W(\theta, d) \quad \text{for all } \theta \in \Omega,\ d \in D. \tag{14}$$
Then a decision function $\delta$ is said to be invariant if
$$\delta(gX) = g^* \delta(X). \tag{15}$$


This is a natural generalization of the definition of invariance given by Hunt
and Stein [6, 7], and is essentially the definition used by Peisakoff [8]. Further,
$\delta$ is said to be almost invariant if (15) holds except on a set $N_g$ of measure 0.

Whenever among all unbiased procedures there exists a unique¹ one that
uniformly minimizes the risk, then it is almost invariant. This follows easily
from the fact that if $\delta(X)$ is unbiased $g^* \delta(g^{-1} X)$ is also unbiased. It is not in
general true that conversely an optimum invariant test is necessarily unbiased.
However, this result does hold under certain restrictions.² If

(i) $\bar{G}$ is transitive, i.e., given any $\theta$, $\theta'$ there exists $\bar{g}$ such that $\theta = \bar{g}\theta'$,

(ii) $G^*$ is commutative,
and if among all invariant (or almost invariant) procedures there exists one
that uniformly minimizes the risk, then it is unbiased.

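To see this, let $\delta$ be invariant and such that for any other invariant
procedure $\delta'$
$$E_\theta W(\theta, \delta'(X)) \ge E_\theta W(\theta, \delta(X)).$$
Let $\theta' \ne \theta$, $\theta = \bar{g}\theta'$, say. Then
$$E_\theta W(\theta', \delta(X)) = E_\theta W(\theta, g^* \delta(X)) \ge E_\theta W(\theta, \delta(X)).$$

Here the inequality follows since by (ii) the invariance of $\delta(X)$ implies that
$g^* \delta(X)$ is also invariant.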

While assumptions (i) and (ii) are satisfied in many estimation problems, (i)
will in general not hold in a problem of hypothesis testing because of the
asymmetry of $d_0$ and $d_1$. Here the result in question follows when the loss function is
given by (8) from the fact that if a test is unbiased so is any test that is
uniformly better, together with the unbiasedness and invariance of the test $\varphi(x) \equiv a/(a + b)$
(i.e., the test that rejects the hypothesis with probability $a/(a + b)$
regardless of the observations).

That the result is not true in general if we drop either one of the two
conditions (i) or (ii) can be seen from the following example. For estimating the
mean $\xi$ of a normal variable with unknown variance $\sigma^2$ when the loss function
is $[(\delta(x) - \xi)/\sigma]^2$, the best invariant estimate is $\bar{X}$ both with respect to the
group
$$G_1: gx = x + b, \qquad -\infty < b < \infty$$
and with respect to
$$G_2: gx = ax + b, \qquad 0 < a < \infty,\ -\infty < b < \infty.$$
For this problem an unbiased estimate in the sense of (5) does not exist, and
it is seen that $G_1$ satisfies (ii) but not (i) while $G_2$ satisfies (i) but not (ii).

¹ Throughout, this is understood to mean unique up to a set of measure zero.

² I am grateful to the referee for pointing out an error in my original statement of this
result.

The notion of unbiasedness in many cases leads to reasonable decision
procedures and this seems to be in general the value of such concepts. On the other
hand there is no guarantee that an optimum unbiased procedure is necessarily
satisfactory. As an example (for another example see [9]) consider a Poisson
variable $X$ which is observed only if $X \ne 0$, so that the distribution of $X$ is
given by
$$P(X = K) = \frac{\lambda^K}{K!}\, e^{-\lambda} (1 - e^{-\lambda})^{-1}, \qquad K = 1, 2, \cdots. \tag{15}$$
It is desired to estimate the probability $e^{-\lambda}$ of $X$ being zero, and the loss
function is squared error. The condition of unbiasedness gives
$$\sum_{K=1}^{\infty} \delta(K)\, \frac{\lambda^K}{K!} = 1 - e^{-\lambda} = \sum_{K=1}^{\infty} (-1)^{K+1}\, \frac{\lambda^K}{K!}, \tag{16}$$
so that $\delta(K) = (-1)^{K+1}$. Thus the estimate takes on only impossible values
and instead of decreasing with $K$ as one would expect, it does not depend on
the order of magnitude of $K$ at all.
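A short numerical check of this example, offered as an illustrative sketch rather than as part of the original note (the function names and the series cutoff are assumptions): summing $\delta(K) P(X = K)$ over $K$ should reproduce $e^{-\lambda}$ even though $\delta$ only takes the values $+1$ and $-1$.

```python
import math

def truncated_poisson_pmf(k, lam):
    """P(X = k), k = 1, 2, ..., for a Poisson(lam) variable observed only when X != 0."""
    return lam**k * math.exp(-lam) / (math.factorial(k) * (1.0 - math.exp(-lam)))

def delta(k):
    """The estimate of exp(-lam) that satisfies the unbiasedness condition (16)."""
    return (-1) ** (k + 1)

def expected_delta(lam, terms=60):
    """Partial sum for E[delta(X)]; 60 terms is an arbitrary cutoff, ample for moderate lam."""
    return sum(delta(k) * truncated_poisson_pmf(k, lam) for k in range(1, terms + 1))

for lam in (0.5, 1.0, 3.0):
    print(lam, expected_delta(lam), math.exp(-lam))
```

The two printed columns agree to rounding error, which confirms unbiasedness while making plain that the estimate itself never lies strictly between 0 and 1.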

As a final remark we mention, without going into details, the following
extension of the notion of unbiasedness. Instead of comparing $E_\theta W(\theta', \delta(X))$ only
with $E_\theta W(\theta, \delta(X))$ we may ask that $E_\theta W(\theta', \delta(X))$ be a nondecreasing function
of $v(\theta, \theta')$, where $v(\theta, \theta')$ in some sense measures the distance between $\theta$ and $\theta'$.
This notion is a generalization of one used by P. L. Hsu [10] in the theory of
hypothesis testing. It is also closely connected with the principle of invariance.
In fact if there exists a group of transformations leaving the problem invariant
then with a suitable definition of $v(\theta, \theta')$ it is easy to see under weak assumptions
on the loss function that Theorem 7.1 of [7] generalizes to the present case.
This theorem states essentially that the totality of procedures for which
$E_\theta W(\theta', \delta(X))$ depends only on $v(\theta, \theta')$ coincides with the totality of invariant
procedures.


REFERENCES

[1] J. Neyman and E. S. Pearson, "Contributions to the theory of testing statistical
hypotheses. I. Unbiased critical regions of type A and type A_1," Stat. Res. Mem.,
Vol. 1 (1936), pp. 1-37.
[2] J. Neyman, "Outline of a theory of statistical estimation based on the classical theory
of probability," Phil. Trans. Roy. Soc. London, Series A, Vol. 236 (1937), pp.
333-380.
[3] F. N. David and J. Neyman, "Extension of the Markoff theorem on least squares,"
Stat. Res. Mem., Vol. 2 (1938), pp. 105-116.
[4] G. W. Brown, "On small sample estimation," Annals of Math. Stat., Vol. 18 (1947),
pp. 582-585.
[5] A. Wald, Statistical Decision Functions, John Wiley and Sons, 1950.
[6] G. Hunt and C. Stein, "Most stringent tests of statistical hypotheses," unpublished.
[7] E. L. Lehmann, "Some principles of the theory of testing hypotheses," Annals of
Math. Stat., Vol. 21 (1950), pp. 1-26.
[8] M. Peisakoff, "Transformation parameters," unpublished thesis, Princeton
University, 1950.
[9] P. R. Halmos, "The theory of unbiased estimation," Annals of Math. Stat., Vol. 17
(1946), pp. 34-43.
[10] P. L. Hsu, "Analysis of variance from the power function standpoint," Biometrika,
Vol. 32 (1941), pp. 62-69.



ONE-SIDED CONFIDENCE CONTOURS FOR PROBABILITY
DISTRIBUTION FUNCTIONS¹


By Z. W. Birnbaum and Fred H. Tingey²
University of Washington

Summary. Let $F(x)$ be the continuous distribution function of a random
variable $X$, and $F_n(x)$ the empirical distribution function determined by a
sample $X_1, X_2, \cdots, X_n$. It is well known that the probability $P_n(\epsilon)$ of $F(x)$
being everywhere majorized by $F_n(x) + \epsilon$ is independent of $F(x)$. The present
paper contains the derivation of an explicit expression for $P_n(\epsilon)$, and a
tabulation of the 10%, 5%, 1%, and 0.1% points of $P_n(\epsilon)$ for $n = 5, 8, 10, 20, 40,
50$. For $n = 50$ these values agree closely with those obtained from an asymptotic
expression due to N. Smirnov.

1. Introduction. Let $X$ be a random variable with the continuous probability
distribution function $F(x) = \text{Prob.}\{X \le x\}$. An ordered sample $X_1 \le X_2
\le \cdots \le X_n$ of $X$ determines the empirical distribution function
$$F_n(x) = \begin{cases} 0 & \text{for } x < X_1, \\ k/n & \text{for } X_k \le x < X_{k+1},\ k = 1, 2, \cdots, n - 1, \\ 1 & \text{for } X_n \le x. \end{cases}$$
The function
$$F^+_{n,\epsilon}(x) = \min\,[F_n(x) + \epsilon, 1],$$
also determined by the sample, will be called an upper confidence contour. It
is well known [2] that the probability
$$P_n(\epsilon) = \text{Prob.}\{F(x) \le F^+_{n,\epsilon}(x) \text{ for all } x\}$$
of $F(x)$ being everywhere majorized by $F^+_{n,\epsilon}(x)$ is independent of the distribution
$F(x)$. An expression for $P_n(\epsilon)$ in determinant form was given by A. Wald and
J. Wolfowitz [2]. N. Smirnov [1] obtained the asymptotic expression
$$\lim_{n \to \infty} P_n\left(\frac{z}{\sqrt{n}}\right) = 1 - e^{-2z^2}. \tag{1.1}$$
The present paper contains the derivation of an explicit expression for $P_n(\epsilon)$,
and a tabulation of values $\epsilon_{n,\alpha}$ such that
$$P_n(\epsilon_{n,\alpha}) = 1 - \alpha \tag{1.2}$$
for $\alpha = .10, .05, .01, .001$, and $n = 5, 8, 10, 20, 40, 50$. For $n = 50$ these values
agree very closely with those obtained from Smirnov's asymptotic expression
(1.1).

¹ Presented to the American Mathematical Society on April 28, 1951.
² Research under the sponsorship of the Office of Naval Research.


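2. Two integral formulae. For any integer $k$, $1 \le k \le n$, we have
$$\int_{X_{k-1}}^{1} \int_{X_k}^{1} \cdots \int_{X_{n-1}}^{1} dX_n \cdots dX_{k+1}\, dX_k = \frac{(1 - X_{k-1})^{n-k+1}}{(n - k + 1)!}. \tag{2.1}$$

This formula is well known and may be obtained by an easy induction.
For any integer $k \ge 0$ we have
$$\int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \cdots \int_{X_k}^{(k/n)+\epsilon} dX_{k+1} \cdots dX_2\, dX_1 = \frac{\epsilon}{(k + 1)!} \left(\epsilon + \frac{k + 1}{n}\right)^{k}. \tag{2.2}$$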
To prove (2.2) one shows by induction that the left-hand expression is equal to 


€ eo ati=i)" 41 
areas NOE Oe 


which is equal to the right-hand term in view of the identity 
m+2 “\mt+1 
oT s+3-i) jot 
deel —1)** = 0, 
2X ( J siiel n 2 


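3. An expression for $P_n(\epsilon)$.
THEOREM. For $0 < \epsilon \le 1$ we have
$$P_n(\epsilon) = 1 - \epsilon \sum_{j=0}^{[n(1-\epsilon)]} \binom{n}{j} \left(1 - \epsilon - \frac{j}{n}\right)^{n-j} \left(\epsilon + \frac{j}{n}\right)^{j-1}, \tag{3.0}$$
where $[n(1 - \epsilon)]$ = greatest integer contained in $n(1 - \epsilon)$.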
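Formula (3.0) is a finite sum and is straightforward to evaluate. The sketch below is illustrative only; the function name `one_sided_prob` is an assumption and does not come from the paper. It computes $P_n(\epsilon)$ directly from (3.0).

```python
import math

def one_sided_prob(n, eps):
    """P_n(eps) from (3.0): probability that F(x) <= F_n(x) + eps for all x."""
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    upper = math.floor(n * (1.0 - eps))  # [n(1 - eps)], the greatest integer in n(1 - eps)
    total = 0.0
    for j in range(upper + 1):
        a = 1.0 - eps - j / n
        b = eps + j / n
        total += math.comb(n, j) * a ** (n - j) * b ** (j - 1)
    return 1.0 - eps * total

# With n = 1 the formula collapses to P_1(eps) = eps.
print(one_sided_prob(1, 0.3))
# The tabulated 10% point for n = 5 should give a value close to .90.
print(one_sided_prob(5, 0.4470))
```

Feeding in a tabulated value of $\epsilon_{n,\alpha}$ should return approximately $1 - \alpha$, which gives a convenient check on individual table entries.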
PROOF. Since $P_n(\epsilon)$ does not depend on $F(x)$, we will assume that $X$ has the
probability distribution function
$$F(x) = \begin{cases} 0 & \text{for } x < 0, \\ x & \text{for } 0 \le x \le 1, \\ 1 & \text{for } 1 < x. \end{cases}$$
For this random variable, $P_n(\epsilon)$ is the probability that the ordered sample
$$0 \le X_1 \le X_2 \le \cdots \le X_n \le 1 \tag{3.1}$$
falls into the region
$$\begin{aligned} X_{j-1} &\le X_j \le \frac{j - 1}{n} + \epsilon & &\text{for } j = 1, 2, \cdots, K + 1, \\ X_{j-1} &\le X_j \le 1 & &\text{for } j = K + 2, \cdots, n, \end{aligned} \tag{3.2}$$
where $X_0 = 0$ and $K = [n(1 - \epsilon)]$. Since the probability density of an ordered
sample $(X_1, X_2, \cdots, X_n)$ is equal to $n!$ in the region (3.1) and to zero elsewhere,
the probability of (3.2) is equal to


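$$P_n(\epsilon) = n!\, J(\epsilon, n, K), \tag{3.3}$$
where
$$J(\epsilon, n, K) = \int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \int_{X_2}^{(2/n)+\epsilon} \cdots \int_{X_K}^{(K/n)+\epsilon} \int_{X_{K+1}}^{1} \int_{X_{K+2}}^{1} \cdots \int_{X_{n-1}}^{1} dX_n \cdots dX_{K+3}\, dX_{K+2}\, dX_{K+1} \cdots dX_2\, dX_1. \tag{3.4}$$

By (2.1) we see that
$$J(\epsilon, n, k) = \int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \int_{X_2}^{(2/n)+\epsilon} \cdots \int_{X_k}^{(k/n)+\epsilon} \frac{(1 - X_{k+1})^{n-k-1}}{(n - k - 1)!}\, dX_{k+1} \cdots dX_2\, dX_1. \tag{3.5}$$

We will prove by induction
$$J(\epsilon, n, k + 1) = J(\epsilon, n, k) - \frac{\epsilon}{n!} \binom{n}{k + 1} \left(1 - \epsilon - \frac{k + 1}{n}\right)^{n-k-1} \left(\epsilon + \frac{k + 1}{n}\right)^{k} \tag{3.6}$$
for any integer $0 \le k < n - 1$. For $k = 0$, (3.6) can be verified directly.
Assuming (3.6) for $k < m$, we obtain
$$\begin{aligned} J(\epsilon, n, m + 1) &= \int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \cdots \int_{X_m}^{(m/n)+\epsilon} \int_{X_{m+1}}^{((m+1)/n)+\epsilon} \frac{(1 - X_{m+2})^{n-m-2}}{(n - m - 2)!}\, dX_{m+2}\, dX_{m+1} \cdots dX_2\, dX_1 \\ &= \int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \cdots \int_{X_m}^{(m/n)+\epsilon} \frac{(1 - X_{m+1})^{n-m-1}}{(n - m - 1)!}\, dX_{m+1} \cdots dX_2\, dX_1 \\ &\quad - \frac{\left(1 - \epsilon - \frac{m + 1}{n}\right)^{n-m-1}}{(n - m - 1)!} \int_0^{\epsilon} \int_{X_1}^{(1/n)+\epsilon} \cdots \int_{X_m}^{(m/n)+\epsilon} dX_{m+1} \cdots dX_2\, dX_1 \end{aligned}$$
and, by the assumption of induction and (2.2), this is
$$J(\epsilon, n, m) - \frac{\epsilon}{n!} \binom{n}{m + 1} \left(1 - \epsilon - \frac{m + 1}{n}\right)^{n-m-1} \left(\epsilon + \frac{m + 1}{n}\right)^{m},$$
which proves (3.6).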


Noting that $J(\epsilon, n, 0) = \frac{1}{n!}[1 - (1 - \epsilon)^n]$, one obtains from (3.6)
$$J(\epsilon, n, K) = \frac{1}{n!} - \frac{\epsilon}{n!} \sum_{j=0}^{K} \binom{n}{j} \left(1 - \epsilon - \frac{j}{n}\right)^{n-j} \left(\epsilon + \frac{j}{n}\right)^{j-1}.$$

This, together with (3.3), completes the proof of (3.0).

Remark. Setting $F^-_{n,\epsilon}(x) = \max\,[F_n(x) - \epsilon, 0]$, one easily verifies that Prob.
$\{F(x) \ge F^-_{n,\epsilon}(x) \text{ for all } x\}$ is equal to $P_n(\epsilon)$, and hence also is given by (3.0).

4. Tabulation of $\epsilon_{n,\alpha}$ and comparison with asymptotic values. Table 1
contains numerical solutions $\epsilon_{n,\alpha}$ of equation (1.2), computed to a number of
digits sufficient to assure that $|P_n(\epsilon_{n,\alpha}) - (1 - \alpha)| < 5 \cdot 10^{-5}$.

TABLE 1³
Solutions $\epsilon_{n,\alpha}$ of equation (1.2)

  n     α = .100    α = .050    α = .010    α = .001
  5     .4470       .5094       .6271       .7480
  8     .3583       .4096       .5065       .6130
 10     .3226       .3687       .4566       .5550
 20     .23155      .26473      .3285       .4018
 40     .16547      .18913      .2350       .2877
 50     .14840      .16959      .2107       .2581


Setting $z/\sqrt{n} = \tilde{\epsilon}_{n,\alpha}$ in (1.1), one obtains for large $n$ the asymptotic values
$$\tilde{\epsilon}_{n,\alpha} = \sqrt{\frac{1}{2n} \log \frac{1}{\alpha}}. \tag{4.1}$$

These values are presented in Table 2.


TABLE 2
Values of $\tilde{\epsilon}_{n,\alpha} = \sqrt{\frac{1}{2n} \log \frac{1}{\alpha}}$

  n     α = .100    α = .050    α = .010    α = .001
  5     .4799       .5473       .6786       .8311
  8     .3794       .4327       .5365       .6571
 10     .3393       .3870       .4799       .5877
 20     .2399       .2737       .3393       .4156
 40     .1697       .1935       .2399       .2938
 50     .1517       .1731       .2146       .2628
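The entries of Table 2 follow directly from (4.1). As a brief illustration (the function name is an assumption of this sketch, and natural logarithms are used, as Smirnov's formula (1.1) requires):

```python
import math

def asymptotic_eps(n, alpha):
    """Large-sample approximation (4.1): sqrt(log(1/alpha) / (2n))."""
    return math.sqrt(math.log(1.0 / alpha) / (2.0 * n))

for n in (5, 8, 10, 20, 40, 50):
    row = [round(asymptotic_eps(n, a), 4) for a in (0.10, 0.05, 0.01, 0.001)]
    print(n, row)
```

Each printed row can be compared with the corresponding row of Table 2.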


A comparison of the two tables indicates that, for the probability levels
$.001 \le \alpha \le .1$, the asymptotic values $\tilde{\epsilon}_{n,\alpha}$ are greater than the "exact" values
$\epsilon_{n,\alpha}$ so that the error committed by using $\tilde{\epsilon}_{n,\alpha}$ instead of $\epsilon_{n,\alpha}$ would be in the
safe direction, and that this error becomes already very small for $n = 50$.

³ The authors wish to express their appreciation to the National Bureau of Standards,
Institute for Numerical Analysis, for performing the computations which are summarized
in this table.


REFERENCES
[1] N. Smirnov, "Sur les écarts de la courbe de distribution empirique," Rec. Math. (Mat.
Sbornik), N.S. Vol. 6 (48) (1939), pp. 3-26.
[2] A. Wald and J. Wolfowitz, "Confidence limits for continuous distribution functions,"
Annals of Math. Stat., Vol. 10 (1939), pp. 105-118.




ON THE ESTIMATION OF CENTRAL INTERVALS WHICH CONTAIN 
ASSIGNED PROPORTIONS OF A NORMAL UNIVARIATE POPULATION 


By G. E. Albert and Ralph B. Johnson


University of Tennessee and Clemson Agricultural College


Summary. For samples of any given size $N > 2$ from a normal population,
Wilks [1] has shown how to choose the parameter $\lambda_p$ so that the expected
coverage of the interval $\bar{y} \pm \lambda_p s$ will be $1 - p$. The present paper treats the choice
of the minimal sample size $N$ necessary to effect a certain type of statistical
control on the fluctuation of that coverage about its expected value; a brief
table of such minimal sample sizes is given.

1. Introduction. Let $F(y)$ denote the normal cumulative distribution function
$$F(y) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{y} e^{-(u - m)^2/(2\sigma^2)}\, du. \tag{1}$$
If $p$ is any number in the range $0 < p < 1$, factors $\lambda(p)$ are well known such
that the proportion
$$A = F(m + \lambda\sigma) - F(m - \lambda\sigma) \tag{2}$$
of the probability between $m \pm \lambda\sigma$ will equal $1 - p$.

If $m$ and $\sigma$ are unknown, it is natural to consider the random variable
$$A(\bar{y}, s; \lambda) = F(\bar{y} + \lambda s) - F(\bar{y} - \lambda s), \tag{3}$$
where $\bar{y} = \sum_{k=1}^{N} y_k/N$ and $s^2 = \sum_{k=1}^{N} (y_k - \bar{y})^2/(N - 1)$.

Obviously $\lambda$ cannot be chosen to guarantee $A(\bar{y}, s; \lambda) = 1 - p$. S. S. Wilks
[1] has shown that, for a random sample of size $N$, the expectation of (3) is
$1 - p$,
$$E A(\bar{y}, s; \lambda) = 1 - p, \tag{4}$$
if the parameter $\lambda$ is chosen as
$$\lambda_p = t_p \sqrt{\frac{N + 1}{N}}. \tag{5}$$

In (5) $t_p$ is such that for Student's $t$-distribution of $N - 1$ degrees of freedom
$\Pr\{|t| > t_p\} = p$.
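Wilks' result (4) can be illustrated by simulation. The following is a rough Monte Carlo sketch and not the authors' computation: the function name, the seed, and the number of replications are arbitrary choices, and the value $t_p = 2.262$ (the two-sided 5% point of Student's $t$ with 9 degrees of freedom) is taken from standard tables as an input.

```python
import math
import random
from statistics import NormalDist

def expected_coverage(N, t_p, reps=20000, seed=0):
    """Monte Carlo estimate of E A(ybar, s; lambda_p), lambda_p = t_p * sqrt((N + 1) / N)."""
    rng = random.Random(seed)
    population = NormalDist()  # m = 0, sigma = 1; the coverage does not depend on m, sigma
    lam = t_p * math.sqrt((N + 1) / N)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        ybar = sum(sample) / N
        s = math.sqrt(sum((y - ybar) ** 2 for y in sample) / (N - 1))
        total += population.cdf(ybar + lam * s) - population.cdf(ybar - lam * s)
    return total / reps

estimate = expected_coverage(N=10, t_p=2.262)
print(estimate)  # should be close to 1 - p = .95
```

The average coverage is near .95, but individual samples of size 10 give coverages that scatter noticeably around that value, which is the variability the remainder of the paper sets out to control.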

Wilks' study of the variability of $A(\bar{y}, s; \lambda)$ was based upon an approximate
consideration of the variance of $A$. It is the purpose of this paper to present
more precise results in this latter connection.

Let $d_1$, $d_2$ and $\alpha$ be assigned positive numbers satisfying the inequalities
$0 < 1 - p - d_1 < 1 - p + d_2 < 1$, and $0 < \alpha < 1$. It is shown that if $\lambda$ be
chosen as in (5), the requirement
$$\Pr\{1 - p - d_1 < A(\bar{y}, s; \lambda) < 1 - p + d_2\} \ge \alpha \tag{6}$$
places a lower bound on the sample size $N$. It is clear that if $d_1$ and $d_2$ are small
and $\alpha$ near unity, (6) places a control on the variability of $A(\bar{y}, s; \lambda)$ about its
expectation $1 - p$.


TABLE I 


05 
95 


dy 


075) : 24 49/ 54 128 226 
05 |. -—| — 48 9] 76 174 28 
025 | 025. - -| 65 159 299! 298 692 1194] 245 567 975 
035.015 | — —! 107 274 510] 420 1332 2628 | 337 1079 2184 
05 |. 27' 196 640 1230} 813 2991 5983 | 649 2488 4928 
.025 | .01 | 26 64] 226 641 1230 | 907 2993 5983 | 725 2487 4928 
02 .01 37 88) 254 657 1231 | 1025 3015 5982 | 825 2502 4928 
01.01 | 110 319! 428 1009 1750 | 1846 4319 7456 | 1507 3540 6084 





Methods devised by Wald and Wolfowitz [2] are easily adapted to the ap- 
proximate calculation of the probability (6). 

Table I presents minimal values of the sample size $N$ to effect the control (6)
for various values of the constants $p$, $d_1$, $d_2$ and $\alpha$. The indication is clear that
the prediction of probability intervals based upon the estimates $\bar{y}$ and $s$ from small
samples is not very reliable.

2. The expectation of $A$ and the probability (6). Writing $u = (\bar{y} - m)/\sigma$
and $v = s/\sigma$, $A(\bar{y}, s; \lambda)$ becomes
$$A^*(u, v; \lambda) = \frac{1}{\sqrt{2\pi}} \int_{u - \lambda v}^{u + \lambda v} e^{-t^2/2}\, dt. \tag{7}$$
It is well known that the variables $u\sqrt{N}$ and $(N - 1)v^2$ are independently
distributed, the first being normal with zero mean and unit variance and the second
being chi-square with $N - 1$ degrees of freedom. One readily derives (Wilks [1])
$$E(A) = \Pr\left\{|t| \le \lambda \sqrt{\frac{N}{N + 1}}\right\},$$
where $t$ has Student's distribution with $N - 1$ degrees of freedom. Setting this
equal to $1 - p$, the choice (5) for $\lambda$ is obtained.

To calculate the probability (6), one integrates the joint frequency function
$f(u, v)$ over that portion of the half plane $-\infty < u < \infty$, $v > 0$ on which
$1 - p - d_1 < A^* < 1 - p + d_2$. To perform the integration, one proceeds
as in Wald and Wolfowitz [2] where a similar problem is solved. Define two
functions
$$v_r = v_r(u), \qquad r = 1, 2, \tag{8}$$
by the equations
$$A^*(u, v_r; \lambda) = 1 - p + (-1)^r d_r, \qquad r = 1, 2, \tag{9}$$
where $A^*$ is defined by (7) and $\lambda$ is given by (5). The functions $v_r(u)$ are monotone
increasing relative to $|u|$. It follows that
$$\Pr\{1 - p - d_1 < A(\bar{y}, s; \lambda) < 1 - p + d_2\} = \sqrt{\frac{N}{2\pi}} \int_{-\infty}^{\infty} e^{-N u^2/2} P(u)\, du, \tag{10}$$
where
$$P(u) = \Pr\{(N - 1) v_1^2(u) < \chi^2 < (N - 1) v_2^2(u)\}, \tag{11}$$
$\chi^2$ being distributed as chi-square with $N - 1$ degrees of freedom.

The formulas (10) and (11) are too unwieldy for much computation. Following
Wald and Wolfowitz [2] again, one can show that a good approximation for
large $N$ is
$$\Pr\{1 - p - d_1 < A(\bar{y}, s; \lambda) < 1 - p + d_2\} \approx P(N^{-1/2}), \tag{12}$$
the right member being given by (11).

3. Computational procedure. For a given set of values of $p$, $d_1$, and $d_2$, one
may now tabulate (12) against $N$ by the following steps. Using $\lambda$ as given by
(5), the $v_r = v_r(N^{-1/2})$ defined by (9) are found by trial and error from a standard
normal distribution table. Then (11) and (12) give the control probability. One
easily picks out the minimal $N$ for which (6) is satisfied. Tables of the incomplete
gamma function [3] are available and the authors are in possession of graphs of
the chi-square distribution prepared from these tables by the use of spline
curves. The detail of the graphs is sufficient for three-decimal accuracy in reading
probabilities. For small values of $N$ and values beyond the range of tables, a
variety of standard methods of approximation for (11) were used.

Lower and upper bounds for the integral (10) are easily devised using obvious
approximate quadrature methods. See Wald and Wolfowitz [2] in this
connection. The small values of $N$ in Table I were checked by such a device. The
authors are confident that the computation was sufficiently accurate to make
the table useful for practical purposes.

4. Generalization. The formulation of the problem discussed above may be
generalized to the case in which the mean $m$ of the distribution (1) depends
linearly upon $k$ sure variables $x_1, x_2, \cdots, x_k$. The $N$ observations are then
$N$ $(k + 1)$-tuples $(y_i; x_{1i}, x_{2i}, \cdots, x_{ki})$, $i = 1, 2, \cdots, N$, and the mean has
the form
$$m = \alpha + \sum_{j=1}^{k} \beta_j (X_j - \bar{x}_j)$$
for an arbitrary set of values $X_1, X_2, \cdots, X_k$ of the sure variables. Referring
to Cramér ([4], pages 551 and 552) for notations and formulas in order to save
space here, one replaces the interval estimate $(\bar{y} \pm \lambda s)$ above by the interval
from $R_1$ to $R_2$, where


R, a* o 2d, 85(X; —_ %;) + (—1)’\*e*, 


A* = 


A 
M=14 > 44 (x,— 2X; — 8). 
i, jal L 

Here $t_p$ is chosen as in (5) except that the degrees of freedom are now $N - k - 1$.

For this generalization, when $N/M$ is large, the control probability (6) is
approximated by $P(M^{1/2}/N^{1/2})$ where $P(u)$ is given by (11). Organized
computation for this generalization does not seem feasible since the values of the
quadratic form $M$ may vary greatly from one application to another.


REFERENCES
[1] S. S. Wilks, "Determination of sample sizes for setting tolerance limits," Annals of
Math. Stat., Vol. 12 (1941), pp. 91-96.
[2] A. Wald and J. Wolfowitz, "Tolerance limits for a normal distribution," Annals of
Math. Stat., Vol. 17 (1946), pp. 208-215.
[3] K. Pearson, Tables of the Incomplete Gamma Function, Cambridge University Press,
1922.
[4] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.







ON DEPENDENT TESTS OF SIGNIFICANCE IN THE ANALYSIS
OF VARIANCE¹


By A. W. Kimball
Oak Ridge National Laboratory


1. Introduction. Some statisticians and other practitioners of the analysis of 
variance have expressed concern over the fact that many experimental designs 
lead to multiple tests of significance which are not independent in the
probability sense. Factorials, latin squares, lattices, etc. have the advantage of
enabling a research worker to test several hypotheses in one experiment, but all tests
ordinarily depend on the same estimate of population variance. It is argued 
that whatever error is present in this estimate for a particular experiment will 
affect all tests of hypothesis in the same manner, and one tends either to accept 
or reject a large proportion of the hypotheses when the population variance is 
respectively overestimated or underestimated. The difficulty can be avoided by 
performing a separate experiment for each hypothesis to be tested, but this 
would contradict the whole philosophy of experimental design. 

This paper deals with an attempt to evaluate the effect of dependency among 
the tests of significance when each experiment is treated as a unit regardless of 
the number of hypotheses tested per experiment. From this point of view if all 
null hypotheses are true, an error is committed if one or more of the hypotheses 
are rejected. It is shown that the probability of making no errors of the first kind 
in one experiment is greater when the tests are dependent than when they 
are independent. For those who prefer this way of looking at the problem, the 
doubts expressed in the first paragraph should be dispelled. The situation in 
which risks are calculated using the hypothesis rather than the experiment as a 
unit is not considered. 

In the following sections it is assumed that samples are taken independently 
from normal populations having the same variance and having means additively 
related in a manner defined by the design of the experiment. These are the usual 
assumptions associated with analysis of variance models in which the parameters 
are population means (as distinguished from components of variance models). 

2. Case of two dependent tests of hypothesis. We shall consider first the case 
of an analysis of variance in which two hypotheses are tested using the same 
error variance for each test. A well known example of this case occurs in the 
analysis of variance with two criteria of classification where the effects of both 
rows and columns are to be tested. In the usual cases, formulation as a general 
linear hypothesis leads to three quadratic forms, $q_1$, $q_2$, and $q_3$, which are
independently distributed as $\chi^2$ with $n_1$, $n_2$, and $n_3$ degrees of freedom, respectively.$^2$
The likelihood ratio statistics for testing the two hypotheses are then

$F_1 = \dfrac{n_3 q_1}{n_1 q_3}$

and

$F_2 = \dfrac{n_3 q_2}{n_2 q_3}$.

1 This work was begun while the author was at the USAF School of Aviation Medicine,
Randolph Field, Texas.


2 For a more complete statement, see [1], p. 177. 







If the critical region for the rejection of each null hypothesis is of size $\alpha$, the
probability of making no errors of the first kind is given by

$P\{F_1 \le F_{1\alpha},\ F_2 \le F_{2\alpha}\}$,

where $F_{1\alpha}$ and $F_{2\alpha}$ are the $100\alpha$ per cent points of the distributions of $F_1$ and
$F_2$, respectively. We shall prove$^3$ that

(1)    $P\{F_1 \le F_{1\alpha},\ F_2 \le F_{2\alpha}\} > P\{F_1 \le F_{1\alpha}\} \cdot P\{F_2 \le F_{2\alpha}\}$.


Since $q_1$, $q_2$, and $q_3$ are independent, their joint density is the product of
three $\chi^2$ densities. Clearly (1) may be written

(2)    $P\{q_1 < k_1 q_3,\ q_2 < k_2 q_3\} > P\{q_1 < k_1 q_3\} \cdot P\{q_2 < k_2 q_3\}$,

where $k_1 = n_1 F_{1\alpha}/n_3$, $k_2 = n_2 F_{2\alpha}/n_3$. Expressed in integral form, (2) becomes


(3)    $\int_0^\infty f_1(q_3) f_2(q_3) f_3(q_3)\,dq_3 > \int_0^\infty f_1(q_3) f_3(q_3)\,dq_3 \int_0^\infty f_2(q_3) f_3(q_3)\,dq_3$,

where for $i = 1$ or 2, $f_i(q_3)$ is the integral from zero to $k_i q_3$ of a $\chi^2$ density with
$n_i$ degrees of freedom, while $f_3(q_3)$ is the $\chi^2$ density function with $n_3$ degrees of
freedom. Since $f_1(q_3)$ and $f_2(q_3)$ are positive strictly monotonically increasing
functions of $q_3$, and $f_3(q_3)$ is a density function, (3) may be written

(4)    $E[f_1(q_3) f_2(q_3)] > E[f_1(q_3)] \cdot E[f_2(q_3)]$,

where the expected values are taken over the $\chi^2$ probability distribution of $q_3$.

The inequality expressed in (4) may be proved as a special case of the following
theorem.$^4$

THEOREM. If $f(x) \ge 0$ and $g(x) \ge 0$ are both strictly monotonically increasing
functions of a random variable $x$ having the probability density $h(x)$ $(0 < x < \infty)$,
and if both $f(x)$ and $g(x)$ have finite expectations, then

$E[f(x)g(x)] - E[f(x)] \cdot E[g(x)] > 0$.
PROOF. We may write

$E[f(x)g(x)] - E[f(x)] \cdot E[g(x)] = \int_0^\infty f(x)\{g(x) - E[g(x)]\}h(x)\,dx = I$,

say. Because of the monotonicity of $g(x)$, there must exist a quantity $x_0 > 0$
such that $g(x_0) = E[g(x)]$. It follows that

$I = -\int_0^{x_0} f(x)\{E[g(x)] - g(x)\}h(x)\,dx + \int_{x_0}^\infty f(x)\{g(x) - E[g(x)]\}h(x)\,dx = -I_1 + I_2$,


3 The trivial cases in which either $F_{1\alpha}$ or $F_{2\alpha}$ or both are either zero or infinite are excluded.
4 The author is indebted to Dr. Max Halperin for the proof of this theorem.





say. Since

$\int_0^\infty \{g(x) - E[g(x)]\}h(x)\,dx = 0$,

we must have

$-\int_0^{x_0} \{g(x) - E[g(x)]\}h(x)\,dx = \int_{x_0}^\infty \{g(x) - E[g(x)]\}h(x)\,dx = J$,

say. Furthermore, since $f(x)$ is a strictly monotonically increasing function of
$x$, it follows that

$I_1 < f(x_0) J, \qquad I_2 > f(x_0) J$.

Therefore, $I_2 - I_1 = I > 0$, and the theorem is proved.

It is obvious that the foregoing theorem may be applied directly to prove 
the validity of (4). This in turn verifies (1). 

Although the proof in this section was introduced by reference to a specific
model in the analysis of variance, it is clearly valid for any two F-tests of significance
which satisfy the relationships with respect to $q_1$, $q_2$, and $q_3$, and in
general for any $n_1$, $n_2$, and $n_3$.
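
Inequality (1) can be checked informally by simulation. The sketch below is only an illustration under assumed settings (degrees of freedom 2, 2 and 4, $\alpha = 0.05$, a fixed seed, and empirical rather than tabulated percentage points); the function name is hypothetical and nothing here comes from the paper itself.

```python
import random

def simulate_two_f_tests(n1=2, n2=2, n3=4, alpha=0.05, trials=100_000, seed=1):
    """Monte Carlo estimates of P{F1 <= F1a, F2 <= F2a} and of the product
    P{F1 <= F1a} * P{F2 <= F2a} for two F ratios sharing one denominator."""
    rng = random.Random(seed)

    def chi2(df):
        # A chi-square variate with df degrees of freedom is Gamma(df/2, scale 2).
        return rng.gammavariate(df / 2.0, 2.0)

    f1, f2 = [], []
    for _ in range(trials):
        q1, q2, q3 = chi2(n1), chi2(n2), chi2(n3)
        f1.append((q1 / n1) / (q3 / n3))
        f2.append((q2 / n2) / (q3 / n3))

    # Empirical percentage points (an assumption made to stay within the stdlib).
    cut = int((1.0 - alpha) * trials)
    f1a = sorted(f1)[cut]
    f2a = sorted(f2)[cut]

    p1 = sum(a <= f1a for a in f1) / trials
    p2 = sum(b <= f2a for b in f2) / trials
    joint = sum(a <= f1a and b <= f2a for a, b in zip(f1, f2)) / trials
    return joint, p1 * p2

if __name__ == "__main__":
    joint, product = simulate_two_f_tests()
    print("joint:", joint, "product of marginals:", product)
```

With these settings the estimated joint probability should exceed the product by a margin well above the simulation error, in line with (1).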

3. Extension to several dependent tests of hypothesis. The extension of (1)
to more than two tests of significance is straightforward. If there are three
F-tests, we must show that

(5)    $\int_0^\infty f_0(q_3) f_1(q_3) f_2(q_3) f_3(q_3)\,dq_3 > \int_0^\infty f_0(q_3) f_3(q_3)\,dq_3 \int_0^\infty f_1(q_3) f_3(q_3)\,dq_3 \int_0^\infty f_2(q_3) f_3(q_3)\,dq_3$,

where $f_0(q_3)$ is a function similar to $f_1(q_3)$ and $f_2(q_3)$ resulting from the third test
of significance. From Section 2 we know that

(6)    $\int_0^\infty f_0(q_3) f_1(q_3) f_2(q_3) f_3(q_3)\,dq_3 > \int_0^\infty f_0(q_3) f_3(q_3)\,dq_3 \int_0^\infty f_1(q_3) f_2(q_3) f_3(q_3)\,dq_3$,

since $f_0(q_3)$ and $f_1(q_3) f_2(q_3)$ satisfy the requirements of the theorem. But from
(3) we may make an obvious substitution in the right-hand side of (6) which
reduces it to (5). Clearly this simple procedure may be repeated as often as
necessary to prove the extension of (1) to any number of F-tests of significance
in which the numerators of the test statistics are mutually independent, and
each is independent of the denominator which is the same for all statistics.
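
For a concrete feel of the extension, one special case can be worked exactly. The sketch below assumes every numerator has 2 degrees of freedom, so that, given $q_3$, each probability is $1 - e^{-k q_3/2}$, and averaging over a $\chi^2$ variate with $n_3$ degrees of freedom gives $E[e^{-jkq_3/2}] = (1 + jk)^{-n_3/2}$. This special case and the function name are illustrative assumptions, not part of the paper.

```python
from math import comb

def no_error_probability(m, n3, alpha):
    """Exact probability of no error of the first kind for m F-tests, each with
    2 numerator degrees of freedom, sharing a denominator with n3 degrees of
    freedom, every test having size alpha (illustrative special case)."""
    # k solves 1 - (1 + k)^(-n3/2) = 1 - alpha for a single test.
    k = alpha ** (-2.0 / n3) - 1.0
    # Expand (1 - exp(-k*q3/2))^m and take expectations term by term.
    return sum((-1) ** j * comb(m, j) * (1.0 + j * k) ** (-n3 / 2.0)
               for j in range(m + 1))

if __name__ == "__main__":
    for m in range(1, 6):
        print(m, no_error_probability(m, 4, 0.05), 0.95 ** m)
```

For $m = 1$ the function returns $1 - \alpha$; for larger $m$ it can be compared with $(1 - \alpha)^m$, the value that would hold if the tests were independent.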
The author wishes to thank Professors J. W. Tukey and H. Levene for helpful 
suggestions in the preparation of this manuscript. 


REFERENCE 


[1] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1946.







ON A CONNECTION BETWEEN CONFIDENCE AND 
TOLERANCE INTERVALS 


By Gottfried E. Noether


Boston University 


The purpose of this note is to point out the close connection which exists 
between confidence intervals for the parameter p of a binomial distribution and 
tolerance intervals. 

Let $k$ be the number of successes in a random sample of size $n$ from a binomial
population with probability $p$ of success in a single trial. Then it is well known
that a confidence interval with confidence coefficient at least $1 - \alpha_1 - \alpha_2$ for
the parameter $p$ is given by

(1)    $p_1(k) < p < p_2(k)$,

where $p_1(k)$ and $p_2(k)$ are determined by $I_{p_1(k)}(k, n - k + 1) = \alpha_1$ and
$I_{1 - p_2(k)}(n - k, k + 1) = 1 - I_{p_2(k)}(k + 1, n - k) = \alpha_2$, respectively,

$I_x(a, b) = \dfrac{\Gamma(a + b)}{\Gamma(a)\Gamma(b)} \int_0^x u^{a-1}(1 - u)^{b-1}\,du$

being the incomplete B-function.
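
The limits in (1) can be computed without tables of the incomplete B-function by using the standard identity $I_p(k, n - k + 1) = P\{K \ge k\}$ for a binomial count $K$. The sketch below solves the two defining equations by bisection; the function names, the tolerance, and the restriction to $1 \le k \le n - 1$ are assumptions made for illustration.

```python
from math import comb

def upper_tail(k, n, p):
    """P{K >= k} for K ~ Binomial(n, p); equals I_p(k, n - k + 1) when 1 <= k <= n."""
    return sum(comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k, n + 1))

def solve_increasing(func, target, tol=1e-12):
    """Bisection for func(p) = target on [0, 1], with func increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binomial_limits(k, n, alpha1, alpha2):
    """Limits p1(k), p2(k) of (1), assuming 1 <= k <= n - 1."""
    # I_{p1}(k, n - k + 1) = alpha1
    p1 = solve_increasing(lambda p: upper_tail(k, n, p), alpha1)
    # I_{p2}(k + 1, n - k) = 1 - alpha2
    p2 = solve_increasing(lambda p: upper_tail(k + 1, n, p), 1.0 - alpha2)
    return p1, p2

if __name__ == "__main__":
    print(binomial_limits(5, 20, 0.025, 0.025))
```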


Let $X_1, \ldots, X_n$ represent a random sample of size $n$ from a population
having continuous cdf $F(x)$. For simplicity assume that the $X$'s are already
arranged in increasing order of size and define $X_0 = -\infty$, $X_{n+1} = +\infty$. The
coverage provided by the interval $(X_i, X_{i+1})$, $i = 0, 1, \ldots, n$, is called an
elementary coverage.$^1$ If we then let $U_r$ stand for the sum of $r$ elementary coverages,
$U_r > U_r(\alpha)$ unless an event of probability $\alpha$ has occurred, where $U_r(\alpha)$
is defined by

$\alpha = \dfrac{\Gamma(n + 1)}{\Gamma(r)\Gamma(n - r + 1)} \int_0^{U_r(\alpha)} u^{r-1}(1 - u)^{n-r}\,du = I_{U_r(\alpha)}(r, n - r + 1)$.

In this notation (1) becomes

$U_k(\alpha_1) < p < U_{k+1}(1 - \alpha_2)$.

Thus the lower end point of a confidence interval for p on the basis of k ob- 
served successes is determined by the corresponding lower limit for the sum of 
k elementary coverages, while the upper end point is determined by the cor- 
responding upper limit of the sum of (k + 1) elementary coverages. The reason 
for this becomes obvious if we look at the $k$ successes as the observations
$X_1, \ldots, X_k$ which are smaller than the $p$-quantile $q_p$ of $F(x)$, so that the coverage
$U_k$ of the chance interval $(X_0, X_k)$ provides an "inner" estimate of $p$, while
the coverage $U_{k+1}$ of the chance interval $(X_0, X_{k+1})$ provides an "outer" estimate.

We may ask what kind of a confidence interval we obtain if we consider as
successes the $k$ observations belonging to an arbitrary interval $J$ for which
$\int_J dF(x) = p$, as long as $J$ does not coincide with either $(-\infty, q_p)$ or $(q_{1-p}, +\infty)$.


1 For rigorous definitions and formulas see, e.g., Wilks [1], p. 13.







It is easily seen that an "outer" estimate of $p$ is still given by $U_{k+1}$. However,
an "inner" estimate is now given by $U_{k-1}$, leading to a lower end point of the
confidence interval which is unnecessarily small.

The method of obtaining a confidence interval for $p$ discussed in this note is
in a certain sense the reverse of the method discussed in an earlier paper of the
author [2]. There it was shown how confidence intervals for $p$ can be used to
obtain confidence intervals for quantiles, which then can be used to obtain
tolerance intervals.


REFERENCES 
[1] S. S. Wilks, "Order statistics," Bull. Am. Math. Soc., Vol. 54 (1948), pp. 6-50.

[2] G. E. Noether, "On confidence limits for quantiles," Annals of Math. Stat., Vol. 19
(1948), pp. 416-419.




ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Minneapolis meeting of the Institute, September 4-7, 1951) 


1. On Stieltjes Integral Equations of Stochastic Processes. Maria Castellani,
University of Kansas City. 


This paper considers two methods of solving certain S-integral equations. 
a. A Fredholm-Stieltjes integral equation of generating functions. We give the F-S integral 


‘ ° . > i > 
equation | A(s, x) dg{z) = f(s), where A(s, x) = X42, ax(x)s~* and f(s) = La,s* for s > 
“E 
¢(s) and ay = 0 if k = 0. Let us assume that u(r) and v(z) are respectively solutions of 


| A(s, z)-A(—s; , 2) du(z) = 1/(S — 8) and | A(s, x) dv(x) = 0. If we consider 
VE E 


| A(s, r)A(—s,, z)f(s.) du(x) = f(si)/(S — S)) 
YB 


and if y(z) is the coefficient of —1/S, in the serial expansion of A(—s, z)f(s:), then under 
fairly general conditions the required solutions are given, almost everywhere, by g(z) = 


az z 
const. | dv(x) + / (2) du(z). The proof is based on a Murphy D’Arcais linear operator 


and on the p operation of S-integrals. 
b. A Volterra-Stieltjes integral of recurrent random functions. Let us have over a time
interval $(\tau, t)$ an unknown r.f. $\theta(t - \tau)$ satisfying the following recursive equation:
$\theta(t - \tau) = \theta(\tau) - \int_\tau^t \theta(x - \tau) p(x)\,dF(x)$, where $F(x)$ is a d.f. and $p(x)$ is bounded.
We assume the interval divided into $m$ parts and also that the set of the $n$ discrete values
of $\theta$ satisfy the following relation: $\theta(t - \tau)/\theta(\tau) = \prod_{s} (1 - p(s)\,\Delta F(s))$. If $F = F_1 + F_2$,
where $F_1$ is a continuous function and $F_2$ is a jump function over a set $S$ of points,
then by a generalized method of Cantelli, taking finer and finer partitions, we obtain as a limit

$\theta(t - \tau)/\theta(\tau) = \exp\left(-\int_\tau^t p(x)\,dF_1(x)\right) \prod_{s \in S} (1 - p(s)\,dF_2(s))$.

This gives almost everywhere the required solutions.







2. An Unfavorable Aspect of the Likelihood Ratio Test. L. M. Court, Rutgers
University. 


The likelihood ratio test has many desirable properties. For example, it is not only 
consistent, but as Wald has shown, uniformly consistent. Still, it can at times be a poor 
test, e.g., under certain circumstances when the size of the test region is properly selected, 
the probability of rejecting the hypothesis to be tested when it is true exceeds the probability 
of rejecting it when any alternative is true. Both Stein and Rubin have given examples of 
this. Stein’s example, quoted by Lehmann in his notes on ‘‘Testing Hypotheses”’ (pp. 1-5) 
consists of a family of discrete distributions (five-valued, to be precise) in which a simple 
hypothesis is tested against a composite alternative. The writer, using a geometrical con- 
struction, gives an example (actually, a broad class of examples, much broader than Stein’s 
which is a 2-parameter class) in which the distributions are continuous. The hypothesis to 
be tested is first simple, then composite; the alternative, always composite. 


3. Impartial Decision Rules and Sufficient Statistics. Raghu Raj Bahadur and
Leo A. Goodman, University of Chicago.


In the following, (1) refers to the paper "On a problem in the theory of k populations," by
R. R. Bahadur (Annals of Math. Stat., Vol. 21 (1950), pp. 362-375). The present paper provides
certain improvements of the main result contained in (1). The authors define the class
of impartial decision rules in terms of permutations of the k samples (rather than in terms
of the k ordered values of an arbitrarily chosen real valued statistic; cf. (1)). This definition
is intuitively more appealing than the one adopted in (1), and permits a unified treatment 
of discrete and absolutely continuous populations. The authors show that if the same 
function is a sufficient statistic for each of k independent samples of equal size, then the 
conditional expectation given the sufficient statistics of an impartial decision rule is also 
an impartial decision rule. They also give a characterization of impartiality which relates 
the present definition to that of (1). These results, together with Theorem 1 of (1), yield 
the desired improvements. An illustration of the argument indicated here is given. 


4. Contributions to the Statistical Theory of Counter Data. G. E. Albert,
University of Tennessee and Oak Ridge National Laboratory, and M. L.
Nelson, Oak Ridge National Laboratory.


Let a sequence f of events be such that the number occurring in time T is a chance variable
having a Poisson distribution with mean aT, a > 0. A counting device generates a
new sequence g since, due to a resolving time u > 0, it fails to record all of f. An event in
f (i) can be recorded only if none has been recorded during a time u preceding it, (ii) will
be recorded if it follows its predecessor in f by more than time u, (iii) either will be recorded
with probability p or not recorded with probability 1 - p if it follows its predecessor in f
by time $\le u$. The choices p = 0 and p = 1 give the so called Type I and Type II counters
respectively. The distribution theory of g is obtained as a generalization of the Type II 
theory given by Feller in ‘‘On probability problems in the theory of counters,’’ Courant 
Anniversary Volume, 1948. The Cornish-Fisher normalization is applied to obtain confi- 
dence intervals for the constant a from observations on g of either the time to a preassigned 
count or the count to a preassigned time. These intervals turn out essentially independent 
of p whenever the product au is small; thus the Type I theory reported at an earlier meeting 
covers most of the cases of practical importance. 


5. On the Use of Wald's Classification Statistic. Harman Leon Harter,
Michigan State College. 







In 1944 Wald published a paper introducing the statistic V and giving a general outline 
of its use in problems of classification. Recently the author published a paper giving the 
distribution of V in various cases. The present paper takes the form of an effort to relate 
the two earlier papers and apply the results of the latter. The technique of classifying an 
individual into one of two groups is studied in detail. Let the individual under consideration
belong to the population $\pi$, which is known to be identical with one or the other of two
known populations $\pi_1$ and $\pi_2$. Then one may wish to test the hypothesis $H_1: \pi = \pi_1$
against the alternative hypothesis $H_2: \pi = \pi_2$. The values of the statistic V under these
two hypotheses are given, and a method is outlined for testing $H_1$ against $H_2$, where an
error of the second kind is k times as costly as an error of the first kind. A numerical ex- 
ample is given for the univariate case, for which the distribution of V is given in the author’s 
earlier paper. The same procedure can be applied in the multivariate case when the dis- 
tribution is known. 


6. Polynomial Determination in a Field of Integers Modulo P. Edward C.
Varnum, Barber-Colman Company.


From a study of integers mod 2 applied to on-off relay circuits, a generalization to any
prime modulus, $p$, has been made to construct a $p^n \times p^n$ matrix by which a polynomial in
$n$ variables may be determined when the $p^n$ values of the polynomial are given for all the
combinations of the $p$ values of each of the $n$ variables.
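
One way such a $p^n \times p^n$ matrix can be built is sketched below. The abstract does not describe its construction, so this is an assumed approach: invert the $p \times p$ matrix of powers $x^j$ modulo $p$ and take products of its entries across the $n$ variables (a Kronecker power). All function names are hypothetical.

```python
from itertools import product

def vandermonde_inverse(p):
    """Inverse modulo the prime p of the p x p matrix V[x][j] = x**j (x, j = 0..p-1)."""
    # Augment V with the identity and run Gauss-Jordan elimination mod p.
    a = [[pow(x, j, p) for j in range(p)] + [int(i == x) for i in range(p)]
         for x in range(p)]
    for col in range(p):
        pivot = next(r for r in range(col, p) if a[r][col] % p)
        a[col], a[pivot] = a[pivot], a[col]
        inv = pow(a[col][col], -1, p)  # modular inverse, Python 3.8+
        a[col] = [v * inv % p for v in a[col]]
        for r in range(p):
            if r != col and a[r][col]:
                factor = a[r][col]
                a[r] = [(v - factor * w) % p for v, w in zip(a[r], a[col])]
    return [row[p:] for row in a]

def coefficient_matrix(p, n):
    """Return (index, M): M is p**n x p**n and maps values to coefficients mod p.
    Rows are indexed by exponent tuples, columns by points, both in `index` order."""
    w = vandermonde_inverse(p)
    index = list(product(range(p), repeat=n))
    matrix = []
    for exps in index:
        row = []
        for point in index:
            entry = 1
            for e, x in zip(exps, point):
                entry = entry * w[e][x] % p
            row.append(entry)
        matrix.append(row)
    return index, matrix

def determine_polynomial(values, p, n):
    """values maps each point (x1, ..., xn) to the polynomial's value mod p;
    the result maps each exponent tuple to its coefficient mod p."""
    index, matrix = coefficient_matrix(p, n)
    y = [values[point] % p for point in index]
    return {exps: sum(m * v for m, v in zip(row, y)) % p
            for exps, row in zip(index, matrix)}
```

For example, with $p = 3$ and $n = 2$, tabulating $2 + x + 2xy^2$ at all nine points and passing the table to `determine_polynomial` recovers the three nonzero coefficients, with every exponent kept below $p$.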


7. About Some Symmetrical Distributions from the Perks’ Family of Functions. 
Joseph Talacko, Marquette University.


The Perks’ system of functions includes a family of symmetrical nonnormal distribu- 
tions, from which two probability densities are of growing interest in theoretical statistics: 
the Verhulst’s distribution (logistic distribution) and the hyperbolic cosine distribution. 
In the first part of this paper properties of this family of probability functions are discussed
and the characteristic functions for Verhulst's and the hyperbolic secant distributions
introduced. The Verhulst's probability function $f(t) = \delta e^{-\delta t}/(1 + e^{-\delta t})^2$ has
$C(v) = (\pi v/\delta)\,\mathrm{cosech}(\pi v/\delta)$, and the hyperbolic secant probability function
$g(t) = (2\delta/\pi)(1/(e^{\delta t} + e^{-\delta t}))$ has $C(v) = \mathrm{sech}(\pi v/(2\delta))$. The second part is
concerned with some previously uninvestigated distributions of certain statistics for samples
from Verhulst's population. In particular, distributions of sample means and sum of squares
are discussed. In an appendix
a table of numerical values of Verhulst’s functions is given. 
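
The two characteristic functions quoted above can be checked numerically. The sketch below integrates $\cos(vt)$ against each density with a plain trapezoidal rule; the truncation width, step count, and function names are assumptions chosen for illustration, and the densities are written in terms of $|t|$ (they are symmetric) only to avoid floating-point overflow.

```python
import math

def logistic_density(t, delta):
    """Verhulst (logistic) density delta*e^{-delta t}/(1 + e^{-delta t})^2."""
    e = math.exp(-delta * abs(t))
    return delta * e / (1.0 + e) ** 2

def sech_density(t, delta):
    """Hyperbolic secant density (2*delta/pi) / (e^{delta t} + e^{-delta t})."""
    e = math.exp(-delta * abs(t))
    return (2.0 * delta / math.pi) * e / (1.0 + e * e)

def char_fn(density, v, delta, half_width=60.0, steps=100_000):
    """Trapezoidal approximation of the integral of cos(v*t)*density(t)
    over [-L, L] with L = half_width/delta (sine part vanishes by symmetry)."""
    limit = half_width / delta
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps + 1):
        t = -limit + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.cos(v * t) * density(t, delta)
    return total * h

if __name__ == "__main__":
    delta, v = 1.5, 0.7
    z = math.pi * v / delta
    print(char_fn(logistic_density, v, delta), z / math.sinh(z))
    print(char_fn(sech_density, v, delta), 1.0 / math.cosh(z / 2.0))
```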


8. A Large-Sample Test for the Variation of Sample Covariance Matrices.
Dayle D. Rippe, University of Michigan.


A test criterion is developed to determine whether a given sample covariance matrix 
could be obtained as a result of taking a random sample of size N from a k-variate 
normal population with a given covariance matrix. The test is based upon the fact that
the maximum likelihood estimate of the population covariance $\mu_{ij}$ is
$\hat\mu_{ij} = m_{ij} = (N - 1)^{-1} \sum_{\alpha=1}^{N} (x_{i\alpha} - \bar x_i)(x_{j\alpha} - \bar x_j)$.
The test criterion for large samples is

$\lambda = (N - 1)\left(\ln |\mu_{ij}| - \ln |m_{ij}| + \sum_i \sum_j \mu^{ij} m_{ij} - k\right)$,

where $\lambda$ is distributed as a chi-square with $\tfrac{1}{2}k(k + 1)$ degrees of freedom minus the
number of independent linear restrictions among the variables $(m_{ij} - \mu_{ij})$. The results of
the application of this criterion to
the sampling problems in correlation theory compare favorably with exact sampling re- 
sults, and the range of application is extended considerably. The criterion is sufficiently 
general in application to furnish a large-sample test for completeness of factorization in 
matrix factorization (or factor analysis) for the case of complete initial estimates of the 
communalities. It is applicable to any of the common orthogonal forms of solution. The 







degrees of freedom of the chi-square involved is $\tfrac{1}{2}(k - s)(k - s + 1)$ after $s$ of the total of
k possible common components have been removed. The test may also be applied to deter- 
mine the significance of component loadings in the common factor solution. 


9. Probability Models for Analyzing Time Changes in Attitudes. T. W. Anderson,
Columbia University.


Statistical inference in Markov chains is studied with particular application to data 
in which the finite number of states are states of attitudes of individuals in "panel surveys."
Each individual's sequence of states over a finite number of time points is considered
as an observation from a Markov chain with the same stationary transition probability
matrix $P = (p_{ij})$. $n_i(0)$ individuals hold state $i$ at the time origin. The maximum likelihood
estimate of P is obtained. Tests are obtained for the hypothesis that P is a given
matrix (or that certain elements are given numbers) and for the hypothesis that the transi- 
tion matrix is stationary against alternatives that it varies over time. Extension of results 
to higher order cases is straightforward. A test of the hypothesis that the process is first 
order against the alternative that it is second order is given. When the state is defined in 
terms of two attitudes, a test is given for the hypothesis that the two attitudes change 
independently of each other. Asymptotic distributions of the estimates and of the test 
criteria are obtained under the assumption that $n_i(0) \to \infty$.
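
The abstract does not state the form of the estimate. For a stationary first-order chain the familiar maximum likelihood estimate is the ratio of observed transition counts, $\hat p_{ij} = n_{ij}/\sum_j n_{ij}$, and the sketch below computes it from a set of individual state sequences. The function name and the toy data are illustrative assumptions.

```python
from collections import Counter

def estimate_transition_matrix(sequences, states):
    """Transition-count estimate of P = (p_ij) pooled over all individuals.
    Each sequence is one individual's states at successive time points."""
    counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = {}
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        # Rows with no observed departures are left undefined (NaN).
        matrix[i] = {j: (counts[(i, j)] / row_total if row_total else float("nan"))
                     for j in states}
    return matrix

if __name__ == "__main__":
    panel = ["AAB", "ABB", "BAB"]  # three hypothetical respondents, three waves
    print(estimate_transition_matrix(panel, "AB"))
```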


10. The Variance of a Weighted Average Using Estimated Weights. Paul
Meier, Princeton University.


In various experimental designs (e.g., lattices) the problem of estimation involves 
averaging of two or more means with different variances. The proper weights (inverse variances)
must be estimated from the experimental data. This problem has been treated by Cochran
for the case of a large number of samples of equal size. We consider the case of two or more
samples and find adjustments of order $O(\sum 1/n_i)$, both for the increase in variance and
the bias of estimating the variance. Bounds on the increase in variance due to the use of 
estimated weights are given. Exact computations are made for several special cases. For 
the case of two means with weights based on ten degrees of freedom the adjustments re- 
duce the maximum bias from approximately 15 per cent to less than 2 per cent. 


11. Distribution of Ratios of Quadratic Forms. John Gurland, University of
Chicago. 


The problem of finding the distribution of a quadratic form and of a ratio of quadratic 
forms in normally distributed random variables is considered. By transforming the problem
of ratios into one of linear combinations of independent variables each having a $\chi^2$
distribution, a solution is given in terms of Laguerre polynomials which is more general
than that of Pitman and Robbins (‘‘Application of the method of mixtures to quadratic 
forms in normal variates,’’ Annals of Math. Stat., Vol. 20, 1949, pp. 552-560). The conver- 
gence of the expansion is established, and a new system of polynomials is suggested which 
would afford a solution for all distributions of quadratic forms and ratios of quadratic 
forms in normally distributed random variables. Once the convergence of expansions in 
terms of the new system of polynomials is established, the system will be applicable in a
much wider class of distributions than the Gram-Charlier series. 


12. The Large-Sample Power of Tests Based on Permutations of Observations. 
(Preliminary Report.) Wassily Hoeffding, University of North Carolina.


The results of this paper can be illustrated by the following example. Let $t(x) =
t(x_1, \ldots, x_N)$ be the usual t-statistic for testing whether two samples $x_1, \ldots, x_m$ and
$x_{m+1}, \ldots, x_N$ came from the same (normal) population. The critical region of the standard
test of size $\alpha$ is of the form $|t(x)| > \lambda_N$. As $N \to \infty$, $\lambda_N$ approaches a value $\lambda = \lambda(\alpha)$.
A nonparametric test proposed by E. J. G. Pitman can be described as follows. Let
$x^{(1)}, \ldots, x^{(N!)}$ be the $N!$ permutations of the sample values $x = (x_1, \ldots, x_N)$, so numbered
that $|t(x^{(1)})| \ge \cdots \ge |t(x^{(N!)})|$. Let $k$ be the largest integer $< N!\,\alpha + 1$. Then
the hypothesis that the two samples came from the same (arbitrary) population is rejected
if and only if $|t(x)| > |t(x^{(k)})|$, and the size of the test is $< \alpha$. The critical value
$|t(x^{(k)})| = \tilde\lambda_N$, say, is a random variable. It is shown that as $N \to \infty$, $\tilde\lambda_N$ tends in probability
to $\lambda$ under general conditions which cover the case of two samples from two normal
populations. It follows that in large samples the power of the nonparametric test approaches
that of the standard parametric test. Similar results hold for tests of certain linear hypotheses,
the correlation coefficient test, etc. (Work sponsored by the Office of Naval Research.)
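
For small samples the permutation test described above can be carried out by direct enumeration. The sketch below is one reading of the rule, not the author's procedure: since $|t|$ depends only on which observations fall in the first sample, it enumerates the distinct splits instead of all $N!$ orderings, and it rejects when the proportion of splits with $|t|$ at least as large as the observed value is below $\alpha$. Function names and the tie tolerance are assumptions.

```python
from itertools import combinations
from math import inf, sqrt

def abs_t(x, y):
    """Absolute value of the usual pooled two-sample t-statistic."""
    m, n = len(x), len(y)
    mx, my = sum(x) / m, sum(y) / n
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    if ss == 0.0:
        # Degenerate split: no within-sample variation.
        return inf if mx != my else 0.0
    pooled = ss / (m + n - 2)
    return abs(mx - my) / sqrt(pooled * (1.0 / m + 1.0 / n))

def pitman_permutation_test(x, y, alpha=0.05):
    """Return (proportion, reject) for the permutation test based on |t|."""
    data = list(x) + list(y)
    m, total = len(x), len(data)
    observed = abs_t(x, y)
    count = splits = 0
    for idx in combinations(range(total), m):
        chosen = set(idx)
        first = [data[i] for i in idx]
        second = [data[i] for i in range(total) if i not in chosen]
        splits += 1
        # Small tolerance so that ties are not lost to rounding.
        if abs_t(first, second) >= observed - 1e-12:
            count += 1
    proportion = count / splits
    return proportion, proportion < alpha

if __name__ == "__main__":
    print(pitman_permutation_test([1.0, 1.2, 0.9, 1.1], [3.0, 3.3, 2.9, 3.1]))
```

Enumeration grows quickly with $N$, which is part of why the large-sample agreement with the parametric test is of practical interest.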


13. A Complete Class of Decision Procedures for Distributions with Monotone 
Likelihood Ratio. Herman Rubin, Stanford University.


Let $P(X \le x \mid r) = \int_{-\infty}^{x} f(x, r)\,d\mu(x)$, where $r$ lies in some interval of the reals, and if
$x_1 > x_2$, $r_1 > r_2$, then $f(x_1, r_1) f(x_2, r_2) - f(x_2, r_1) f(x_1, r_2) \ge 0$. This is a generalization
of the exponential family, where $f(x, r) = w(r) \exp(xr)$. Suppose the terminal decision
$d$ ranges over a closed subset $D$ of the reals. Then if the loss function $W(d, r)$ satisfies certain
monotonicity restrictions (which are usually met in multiple decision and estimation
problems), a complete class of decision procedures based on a single observation are those
which are unrandomized, except possibly at jumps of $\mu$, and are monotone.


14. Some Nonparametric Results for Experimental Designs. John E. Walsh,
Bureau of the Census.


In experimental designs, the quantities investigated are often grouped into blocks as a 
method of obtaining a higher precision for the experiment. This grouping may result in 
high correlation among observations within the same block. Also there may be substantial 
variance differences between blocks. Then the t-statistic is not necessarily applicable for
comparing the effects of the treatments under investigation. This paper presents some 
nonparametric results which are usually valid for a well known type of experimental design 
if there is statistical independence among blocks (number of blocks >4). These nonparametric
results are reasonably efficient, compared to those based on the t-statistic, for the
case where the totality of observations are independent, normally distributed, and have 
the same variance. High precision can sometimes be obtained by designing the experiment 
to yield large positive correlation within blocks and then using the nonparametric results. 


15. Efficient Tests and Confidence Intervals for Mortality Rates. John E.
Walsh, Bureau of the Census.


This paper presents large-sample tests and confidence intervals for the ‘‘true value’’ of 
a mortality rate based on insurance data. These results have efficiencies of nearly 100%;
i.e., they utilize nearly all the "information" contained in the data. The procedure used
in obtaining the tests and confidence intervals consists in constructing a suitable t-statistic. 
The construction requires that the data be divided into between 300 and 400 statistically 
independent subgroups of approximately the same size. One possible way of accomplishing 
this is by subdividing the data according to the first three letters of the last name of the 
person insured and then appropriately combining the resulting groups. The amount of 
work required in applying the results of this paper is not appreciably greater than that 







required in obtaining the usual point estimate of the mortality rate; in fact, a procedure 
which yields the point estimate as a byproduct is followed. 


16. Sufficient Statistics when the Carrier of the Distribution Depends on the 
Parameter. D. A. S. Fraser, University of Toronto. 


A “statistic of selection” is defined by a mapping from the space of the distribution to 
the space of Borel sets over that space. This statistic is sufficient if the parameter is a 
"parameter of selection," that is, if the parameter $\theta$ determines only the carrier of the
distribution, the relative density being independent of the parameter. For more general 
distributions a theorem in this paper facilitates obtaining sufficient statistics, subject to 
continuity conditions. 


17. Bayes Solutions and Likelihood Ratio Tests of Some Simple and Composite 
Hypotheses. (Preliminary Report.) Allan Birnbaum, Columbia University.


Let $H_0$ be a hypothesis concerning the density function $p_\theta(e)$, to be tested against the
composite alternative $H_1$, by means of the acceptance region $A$ in the space of the minimal
sufficient statistic $t(e)$. For various distributions in which $t(e)$ is not real-valued, necessary
and sufficient conditions are given for $A$ to be a Bayes solution or the limit of a sequence
of Bayes solutions. The likelihood ratio test, for a wide class of simple and composite
hypotheses, is proved to be the limit of a sequence of Bayes solutions. A condition which
is necessary and sufficient for the admissibility of the likelihood ratio test is derived. The
distributions considered include: (1) $p_\theta(e)$ of general Koopman-Darmois form; the result
here is applied to various examples. (2) $p_\theta(e)$ rectangular; generalizations of this result are
indicated. Methods of approximating these tests are discussed. Applications to problems
of "combining" independent significance tests are made; a definition of admissibility of
methods of combination is proposed, according to which some current methods are inad- 
missible; a minimax multidecision procedure is proposed and developed, to replace certain 
current methods of combining tests. 


18. The Impossibility of Certain Affine Resolvable Balanced Incomplete Block 
Designs. S. S. Shrikhande, Nagpur College of Science, India.


Three theorems on the impossibility of an Affine Resolvable Design (R. C. Bose, "A
note on the resolvability of balanced incomplete designs," Sankhyā, Vol. 6 (1942), pp.
105-110) with parameters $v = nk = n^2(n - 1)t + n^2$, $b = nr = n(n^2 t + n + 1)$, $\lambda = nt + 1$,
with $n \ge 2$, $t \ge 0$ ($n$ and $t$ integral) are proved. THEOREM 1. An Affine Resolvable Design with
the above parameters does not exist when $n$ and $t$ are odd and (i) $n((n - 1)t + 1)$ is not a perfect
square, or (ii) $n((n - 1)t + 1)$ is a perfect square and $nt + 1 \equiv 2 \pmod 4$, and the square-free
part of $n$ contains a prime of the form $4i + 3$. THEOREM 2. An Affine Resolvable Design
with the above parameters does not exist when $n$ is odd and $t$ is even and (i) $(n - 1)t + 1$ is
not a perfect square, or (ii) $(n - 1)t + 1$ is a perfect square and $n + t \equiv 1 \pmod 4$, and the
square-free part of $n$ contains a prime $4i + 3$. THEOREM 3. An Affine Resolvable Design with
the above parameters does not exist for any value of $t$ if $n \equiv 2 \pmod 4$ and the square-free part
of $n$ contains a prime $4i + 3$. The proofs depend on showing the impossibility of a Group
Design obtained from the Affine Resolvable Design by making use of results due to Bose
and Connor (Abstracts No. 4 and No. 6, Annals of Math. Stat., Vol. 22 (1951), pp. 311-312).
The theorem on the impossibility of finite projective planes (R. H. Bruck and H. J. Ryser,
"The nonexistence of certain finite projective planes," Canadian Jour. Math., Vol. 1 (1949),
pp. 88-93) is contained here as a particular case.


19. On Sufficiency and Statistical Decision Functions. Raghu Raj Bahadur,
University of Chicago and Delhi University, India. 







The first part of the paper contains certain characterizations of sufficiency. These re- 
sults are then used to show that the justification for the use of sufficient statistics in sta- 
tistical methodology which was described in an informal way by P. R. Halmos and L. J. 
Savage in the final section of their work on sufficiency ("Application of the Radon-Nikodym
theorem to the theory of sufficient statistics," Annals of Math. Stat., Vol. 20 (1949), pp.
225-241) is valid whenever the decision space may be taken to be a subset of Euclidean 
k-space. This justification is proved first for the case of an arbitrary but fixed sample space, 
and then generalized to sequential sample spaces. The result for the sequential case may 
be outlined as follows. Let $x_1, x_2, \ldots$ be a sequence of chance quantities having a joint
probability distribution $p$ belonging to a family $P$. For each $m = 1, 2, \ldots$, let
$t_m(x_1, x_2, \ldots, x_m)$ be a statistic which is sufficient when the sample space consists of points
$(x_1, x_2, \ldots, x_m)$. Then corresponding to any sequential decision function $\xi$ based on
$x_1, x_2, \ldots$ there exists a sequential decision function $\eta$ based on $t_1(x_1), t_2(x_1, x_2), \ldots$
such that the joint probability distribution of the sample size and the terminal decision
is the same under $\xi$ and $\eta$ for each $p$ in $P$. This result holds without restriction (other than
measurability) on the sampling scheme of $\xi$, so that in the special case of point estimation
with a convex loss function it leads to an enlargement of the domain of Blackwell's Theorem
and its generalizations.


20. A Two Sample Test Procedure. Donald B. Owen, University of Washington.


In testing hypotheses the standard procedure is to specify a test based on a single set 
of observations. Sequential analysis introduced a new concept: that of making a decision 
after each observation, either to accept the hypothesis, or to reject it, or to take another 
observation. Here an approach is worked out that lies somewhat between these: an initial 
set of observations is taken. Then a decision is made to accept, reject, or take one more 
set of observations. After this second set of observations, a decision on the hypothesis 
must be made. 

The problem is to formulate these decision rules at the two stages of the process, to 
optimize them if possible, and to evaluate the performance of the tests. These depend on 
various parameters and it is pertinent to inquire how these parameters affect the answers 
to the questions noted. More precisely, the problem is to maximize (or minimize) with
respect to a, b, c, d expressions of the type $\int_a^b f(z)\,dz$, $\int_c^d g(z)\,dz$, subject to side conditions
(which are also expressed as integrals). The functions f and g are probability density functions:
here those associated with the normal probability density function. There are two
main sections, the basis of division being whether the final decision is based only on the 
second sample or on the whole set of observations. The second procedure is the more eco- 
nomical, but mathematically it is much more difficult. Much less complete results are given 
in this section. In the direction of finding the optimum of this type of rule, the results are 
chiefly negative. Several theorems are given which show the difficulties in obtaining a solution
to this problem. For the rules given the performance of the tests is evaluated and
various theorems worked out concerning them. Some of the lemmas have interest in their 
own right as properties pertaining to important probability density functions. 
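
As an illustration of the first kind of rule (final decision based only on the second sample), the following Python sketch evaluates the rejection probability and the expected sample size as sums and products of normal integrals. The one-sided test of a normal mean with unit variance, the function name `two_stage_test`, and the limits `a`, `b`, `c` are assumptions made for this example; the abstract does not state the rules actually studied.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_stage_test(mu, n1, n2, a, b, c):
    """Rejection probability and expected sample size of a two-stage rule.

    Assumed rule (hypothetical, for illustration only):
      stage 1: Z1 = sqrt(n1) * mean of first sample
               accept if Z1 < a, reject if Z1 > b, otherwise continue;
      stage 2: Z2 = sqrt(n2) * mean of second sample only
               reject if Z2 > c, otherwise accept.
    Observations are taken to be normal with mean mu and unit variance.
    """
    if a > b:
        raise ValueError("need a <= b")
    shift1 = mu * math.sqrt(n1)
    shift2 = mu * math.sqrt(n2)
    p_accept_1 = normal_cdf(a - shift1)          # integral of the density below a
    p_reject_1 = 1.0 - normal_cdf(b - shift1)    # integral of the density above b
    p_continue = 1.0 - p_accept_1 - p_reject_1   # integral from a to b
    p_reject_2 = 1.0 - normal_cdf(c - shift2)
    rejection_probability = p_reject_1 + p_continue * p_reject_2
    expected_sample_size = n1 + n2 * p_continue
    return rejection_probability, expected_sample_size
```

Setting `a` equal to `b` removes the continuation region, so the rule reduces to an ordinary single-sample test; this gives a simple check on the sketch.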


21. A Combinatorial Central Limit Theorem. Wassily Hoeffding, University
of North Carolina. 


Let $(Y_{n1}, \ldots, Y_{nn})$ be a random vector which takes on the n! permutations of
$(1, \ldots, n)$ with equal probabilities. Let $c_n(i, j)$, $i, j = 1, \ldots, n$, be $n^2$ real numbers. Two
sufficient conditions for the asymptotic normality of $S_n = \sum_{i=1}^{n} c_n(i, Y_{ni})$ are given. In
the special case $c_n(i, j) = a_n(i) b_n(j)$, which was considered by Wald and Wolfowitz, the
first condition generalizes a condition given by Noether (‘‘On a theorem by Wald and 
Wolfowitz,” Annals of Math. Stat., Vol. 20 (1949), pp. 455-458). The second condition is 
slightly stronger but simpler as it involves not an infinity of limiting relations but only 
a single one. Applications to the theory of nonparametric tests are indicated. (Work sponsored
by the Office of Naval Research.)
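
A small numerical check may help fix the notation. The Python sketch below enumerates all n! permutations for a small n and compares the exact mean and variance of the statistic with the standard closed forms: the mean is $(1/n)\sum_{i,j} c_n(i, j)$ and the variance is $(1/(n-1))\sum_{i,j} d_n(i, j)^2$, where $d_n$ is the array $c_n$ with row and column means removed. These moment formulas are not stated in the abstract and are supplied here as an assumption of the example; the function names are hypothetical.

```python
import itertools


def exact_moments(c):
    """Mean and variance of S = sum_i c[i][Y_i] by enumerating all permutations."""
    n = len(c)
    values = [
        sum(c[i][perm[i]] for i in range(n))
        for perm in itertools.permutations(range(n))
    ]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def formula_moments(c):
    """Closed-form mean and variance of S under a uniformly random permutation."""
    n = len(c)
    grand = sum(sum(row) for row in c) / (n * n)
    row_mean = [sum(c[i]) / n for i in range(n)]
    col_mean = [sum(c[i][j] for i in range(n)) / n for j in range(n)]
    mean = n * grand
    # d(i, j) is c(i, j) with row and column effects removed (double centering).
    variance = sum(
        (c[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(n)
        for j in range(n)
    ) / (n - 1)
    return mean, variance
```

Enumeration is feasible only for very small n; it serves here to confirm the centering and scaling that a normal approximation to the statistic would use.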


22. Necessary Conditions for the Existence of a Symmetrical Group Divisible 
Design. R. C. Bose and W. S. Connor, Jr., University of North Carolina.


An incomplete block design with v treatments each replicated r times in b blocks of size
k is said to be group divisible if the treatments can be divided into m groups each with n
treatments, so that the treatments of the same group occur together in $\lambda_1$ blocks, and treatments
of different groups occur together in $\lambda_2$ blocks, $\lambda_1 \neq \lambda_2$. The combinatorial properties
and the methods of construction for these designs have been studied by the authors elsewhere
(cf. Abstract No. 4, Annals of Math. Stat., Vol. 22 (1951), p. 311). An incomplete
block design is said to be symmetrical if the number of treatments v equals the number of
blocks b, and in consequence k = r. If N is the incidence matrix of a symmetrical group
divisible design, the Hasse invariant $C_p(NN')$ of the quadratic form with matrix $NN'$
(where $N'$ is the transpose of N, and p is any odd prime) has been obtained in a simple form.
Its value is C,(NN’) = (P, d2) 9(P, n)r(P, -—1) "FQ, n)r@Q, MPS baal LS 
where $P = r^2 - v\lambda_2$, $Q = r - \lambda_1$, and $(a, b)_p$ is the Hilbert norm residue symbol. For the
existence of a symmetrical group divisible design $C_p(NN') = +1$ for all odd primes p,
and $P^{m-1}Q^{m(n-1)}$ must be a perfect square. This shows that necessary conditions for the
existence of a symmetrical group divisible design are (i) if m is even then P must be a
perfect square, and if further m = 4t + 2 and n is even then $(Q, -1)_p = +1$ for any odd
prime p; (ii) if m is odd and n is even then Q must be a perfect square and $((-1)^{\alpha} n\lambda_2, P)_p = +1$
for any odd prime p, where $\alpha = m(m - 1)/2$; (iii) if m and n are both odd then
$((-1)^{\alpha} n\lambda_2, P)_p\,((-1)^{\beta} n, Q)_p = +1$ for any odd prime p, where $\alpha = m(m - 1)/2$ and
$\beta = n(n - 1)/2$. The impossibility of a large number of symmetrical group divisible designs
can be proved by using these conditions. 
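
The perfect-square parts of these conditions are easy to test mechanically. The Python sketch below checks only those parts, using $P = r^2 - v\lambda_2$ and $Q = r - \lambda_1$ with v = mn; the conditions involving the Hilbert norm residue symbol are deliberately left out, so a result of `True` means only that the design is not excluded by the square conditions. The function names and the sample parameter sets are hypothetical.

```python
import math


def is_perfect_square(x):
    """True when the integer x is the square of a non-negative integer."""
    if x < 0:
        return False
    root = math.isqrt(x)
    return root * root == x


def passes_square_conditions(m, n, r, lam1, lam2):
    """Check the perfect-square necessary conditions for a symmetrical
    group divisible design with m groups of n treatments, k = r.

    Only the square conditions are tested; the Hilbert-symbol conditions
    of cases (i) to (iii) are NOT implemented in this sketch.
    """
    v = m * n
    P = r * r - v * lam2
    Q = r - lam1
    # P^(m-1) * Q^(m(n-1)) must be a perfect square.
    if not is_perfect_square(P ** (m - 1) * Q ** (m * (n - 1))):
        return False
    # (i) m even: P must be a perfect square.
    if m % 2 == 0 and not is_perfect_square(P):
        return False
    # (ii) m odd and n even: Q must be a perfect square.
    if m % 2 == 1 and n % 2 == 0 and not is_perfect_square(Q):
        return False
    return True
```

For instance, with m = 6, n = 3, r = 7, $\lambda_1 = 6$, $\lambda_2 = 2$ one finds P = 13, which is not a square, so such a design is ruled out by condition (i).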


23. On a Problem of Mapping of One Space on Another with Applications in 
Sampling Distributions. S. N. Roy, University of North Carolina. 


Denoting a p × n matrix M by M(:p × n), a triangular matrix (with the upper right
hand corner zero) by T (with elements $t_{ij}$), a diagonal matrix with diagonal elements
$\theta_1, \theta_2, \ldots, \theta_p$ by $D_\theta$, and a p × p unit matrix by I(p), consider the transformations
(i) $z(:p \times n) = T(:p \times p)\,L(:p \times n)$, where p < n; z is of rank p; $LL' = I(p)$; $t_{ii} > 0$,
$i = 1, 2, \ldots, p$.

(ii) $z(:p \times n) = M(:p \times p)\,D_{\sqrt{\theta}}(:p \times p)\,L(:p \times n)$, where $MM' = M'M = LL' = I(p)$;
p < n; z is of rank p; the first row of M consists of positive elements; $\theta$ stands for the
p positive characteristic roots of $zz'$, and $\sqrt{\theta}$ stands for the positive square root of
$\theta$.

These transformations have proved extremely useful for almost the entire range of
problems on sampling distributions based on multivariate normal populations. In (i),
by virtue of $LL' = I(p)$, we could choose from L, in various alternative ways, a set of
$pn - p(p + 1)/2$ independent elements to be called $L_I$, and in (ii), by virtue of $MM' = I(p)$,
a similar set of $p(p - 1)/2$ from M to be called $M_I$, and by virtue of $LL' = I(p)$ a
similar set of $pn - p(p + 1)/2$ from L to be called $L_I$. In this paper is discussed the nature
of the transformations (ia) from z to $TL$, under $LL' = I(p)$; (ib) from z to $TL_I$; (iia)
from z to $\theta M L$, under $MM' = M'M = LL' = I(p)$; and (iib) from z to $\theta M_I L_I$. In this
connection certain problems are also posed for mathematical statisticians which the author
has not been able to solve so far.


24. On a Theorem in Jacobians with Statistical Applications. S. N. Roy, Uni- 
versity of North Carolina. 


If $F_i(y_1, \ldots, y_m, x_1, \ldots, x_{m+n}) = 0$ ($i = 1, 2, \ldots, m + n$) are such that we could
select any set of m $x_i$'s (to be called without any loss of generality $x_1, x_2, \ldots, x_m$) and
could find real values of $(y_1, \ldots, y_m)$ and of $(x_{m+1}, \ldots, x_{m+n})$ for real values of
$(x_1, \ldots, x_m)$, then, assuming that the numerator and the denominator on the right-hand side
of the equation below are nonvanishing, and assuming certain other restrictions, we would
have
$$J(y_1, y_2, \ldots, y_m : x_1, x_2, \ldots, x_m) = -\frac{\partial(F_1, \ldots, F_{m+n})/\partial(x_1, \ldots, x_{m+n})}{\partial(F_1, \ldots, F_{m+n})/\partial(y_1, \ldots, y_m, x_{m+1}, \ldots, x_{m+n})},$$
where absolute values of the determinants are to be taken. Important special cases of this
general theorem with various statistical applications are discussed in this paper.


25. The Inventory Problem. A. Dvoretzky, J. Kiefer, and J. Wolfowitz,
Cornell University. 


The inventory problem is the general problem of what quantities of goods to stock in 
anticipation of future demand, where loss is caused by inability to supply demand or by 
stocking goods for which there is no demand. Let $x_i$ be the initial stock of a given commodity
in the ith interval ($i = 1, \ldots, N$) before any ordering is done, and $y_i$ the starting
stock after an amount $y_i - x_i > 0$ has been ordered and instantaneously received by the
stocking agency. The amount demanded in the ith interval is a chance variable with known
distribution function $F_i$. $W_i(x_i, y_i, d_i)$ is the loss incurred in the ith interval when $x_i$
is the initial stock, $y_i$ the starting stock, and $d_i$ is the amount demanded in this interval.
$F_i$, $W_i$, and the expected value of $W_i$ with respect to the demand may also be functions
of the “past history” as given by $\beta_i = \{x_j, y_j, d_j : j < i\}$. An ordering policy Y is a set of
functions $Y_i(x_i, \beta_i)$ ($i = 1, \ldots, N$), where one orders an amount $Y_i(x_i, \beta_i) - x_i$ in the
ith interval. With each Y and $x_1$ there is associated a quantity $A(Y \mid x_1)$ which is the total
expected loss over all intervals (the loss in the ith interval being discounted by a factor
$1 - a_i$) when Y is used and $x_1$ is the initial stock in the first interval. An optimal ($\epsilon$-optimal)
policy is one which minimizes this quantity (within $\epsilon$) for every $x_1$. A method for constructing
such policies is given. The case of an infinite number of intervals is similarly
treated. Analogous results are obtained in more general cases, e.g., when there is a time 
lag between the ordering and delivery of goods, when there are several commodities, etc. 

The second part of the paper deals with the case when the set of distributions F; is 
known only to be a member of a certain class $\Omega$. Constructive methods for obtaining Bayes
solutions and complete classes are given. (This research was sponsored by the Office of 
Naval Research.) 
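
For the simplest version of the problem with known demand distributions, an optimal policy can be sketched by backward induction over the intervals. The Python example below is a minimal illustration under assumptions that are not in the abstract: integer stock levels bounded by `max_stock`, the same demand distribution and loss function in every interval, no dependence on past history, a single constant `discount` multiplier, unmet demand lost rather than carried forward, and the possibility of ordering nothing. The names `optimal_ordering_policy` and `example_loss` are hypothetical, and this is not presented as the authors' method.

```python
def optimal_ordering_policy(num_intervals, max_stock, demand_pmf, loss, discount=1.0):
    """Backward induction for a finite-horizon inventory problem.

    value[i][x]  : minimal expected discounted loss from interval i onward
                   when the initial stock in interval i is x.
    policy[i][x] : starting stock y >= x to order up to in interval i.
    demand_pmf   : dict mapping integer demand d to its probability.
    loss(x, y, d): loss in one interval (assumed the same in every interval).
    """
    value = [[0.0] * (max_stock + 1) for _ in range(num_intervals + 1)]
    policy = [[0] * (max_stock + 1) for _ in range(num_intervals)]
    for i in range(num_intervals - 1, -1, -1):
        for x in range(max_stock + 1):
            best_y, best_cost = x, float("inf")
            for y in range(x, max_stock + 1):
                cost = 0.0
                for d, prob in demand_pmf.items():
                    next_x = max(y - d, 0)  # assumption: unmet demand is lost
                    cost += prob * (loss(x, y, d) + discount * value[i + 1][next_x])
                if cost < best_cost:
                    best_y, best_cost = y, cost
            value[i][x] = best_cost
            policy[i][x] = best_y
    return value, policy


def example_loss(x, y, d):
    """Hypothetical loss: unit ordering cost, unit holding cost, shortage cost 5."""
    return 1.0 * (y - x) + 1.0 * max(y - d, 0) + 5.0 * max(d - y, 0)
```

With a single interval, demand equal to 0 or 2 with probability one half each, and zero initial stock, the sketch orders up to a starting stock of 2, the level at which the expected shortage loss no longer outweighs the ordering and holding cost.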




NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest 


Personal Items 
Mr. Fred C. Andrews, Teaching Assistant and Research Assistant, Statistical 
Laboratory, University of California, Berkeley, was promoted to Lecturer and
Research Assistant, effective July 1, 1951.
Dr. Dorothy S. Brady, formerly Professor of Economics at the University of
Illinois, has returned to Washington as Consultant to the Commissioner on Costs
and Standards of Living, Bureau of Labor Statistics. 

Dr. C. West Churchman has taken a leave of absence for one year from 
Wayne University to accept a visiting professorship in the Engineering Admin- 
istration Department, Case Institute of Technology, Cleveland, Ohio, to do 
work in operations research as applied to industry. 

Mr. Edward L. Corton, Jr., formerly employed at the Naval Ordnance Train- 
ing Station, China Lake, California, is now working as a meteorologist for the 
Navy Hydrographic Office, Washington, D. C. 

Dr. Arthur M. Dutton, after receiving his degree in June from the Iowa State 
College, accepted a position with Professor S. L. Crump at the University of
Rochester Atomic Energy Project, Rochester, New York. 

Dr. Evelyn Fix, Instructor and Research Associate, Statistical Laboratory, 
University of California, Berkeley, was promoted to Assistant Professor and 
Research Associate, effective July 1, 1951. 

Mr. Charles P. Gershenson accepted the position as Research Director of the 
Jewish Children’s Bureau of Chicago after working for five and one-half years 
as Research Associate at the Institute of Psychological Research, Teachers Col- 
lege, Columbia University. His new job entails the development of a research 
program for evaluating the effectiveness of small residential units for the treatment
of emotionally disturbed children.

Dr. E. J. Gumbel has been appointed Consultant to the Applied Mathematics 
and Statistics Laboratory, Stanford University, Stanford, California, for work 
on industrial statistics. 

Dr. Louis Guttman, formerly Professor of Sociology at Cornell University, 
has accepted a position with the Israel Institute of Applied Social Research, 
Shell Building, Julian’s Way, Jerusalem. 

Dr. Paul R. Halmos will be on leave of absence from the University of Chicago 
for the academic year 1951-1952. He will serve as Visiting Professor of Mathe- 
matics at the University of Montevideo under the auspices of the State Depart- 
ment’s Division of International Exchange of Persons. 

Dr. J. L. Hodges, Jr., Assistant Professor and Research Associate, Statistical 
Laboratory, University of California, Berkeley, will be on leave of absence in a 
visiting capacity at the University of Chicago during the academic year 1951- 
1952. 

Mr. Daniel G. Horvitz has accepted an assistant professorship of Biostatistics 
at the University of Pittsburgh School of Public Health. 

Mr. William C. James has left Washington to serve with the National Office 
of Vital Statistics, Public Health Service, in Lima, Peru. Last fall he was ap- 
pointed as international consultant in vital statistics under the international 
vital statistics cooperation program of this department and was in Washington 
for several months training for this work. 

Mr. T. A. Jeeves, Lecturer and Research Associate, Statistical Laboratory, 
University of California, Berkeley, has been promoted to Instructor and Re- 
search Associate, effective July 1, 1951. 





Mr. Jack Kiefer will be an Instructor in Mathematics at Cornell University 
during 1951-1952. 

Dr. Erich L. Lehmann, Assistant Professor and Research Associate, Statistical 
Laboratory, University of California, Berkeley, has been promoted to Associate 
Professor and Research Associate, effective July 1, 1951. During the academic 
year 1951-1952 Professor Lehmann will be in a visiting capacity at Stanford 
University, on leave from the University of California. 

Mr. Edward M. Schrock, formerly Division Engineer, General Electric Co., 
Erie, Pennsylvania, is now Supervisor of Quality Control, Lukens Steel Com- 
pany, Coatesville, Pennsylvania. 

Dr. Elizabeth L. Scott, Instructor and Research Associate, Statistical Labora- 
tory, University of California, Berkeley, was promoted to Assistant Professor 
and Research Associate, effective July 1, 1951. 

Dr. Irving H. Siegel has left Johns Hopkins University, where he served as 
Lecturer in Political Economy and as Director of Productivity Studies in the 
Operations Research Office, to become Director of the American Technology 
Study for the Twentieth Century Fund in Washington. 

Dr. Andrew Sobczyk is on leave of absence from Boston University to serve
as a Staff Member at Los Alamos Scientific Laboratory, University of California,
Los Alamos, New Mexico.

Mr. Robert F. Tate, Associate and Research Assistant, Statistical Laboratory, 
University of California, Berkeley, was promoted to Lecturer and Research 
Assistant, effective July 1, 1951. 

Dr. Shanti A. Vora has resigned his assistant professorship in the Department 
of Statistics, Stanford University, to accept a position as Statistician with the 
Standard-Vacuum Oil Co., India Division Office, Bombay. 


Awards for Post-doctoral Study in Statistics at the University of Chicago 


The Committee on Statistics (a department) of the University of Chicago has 
established, under a five-year grant from the Rockefeller Foundation, a program 
of Post-doctoral Awards to provide training and experience in statistics for 
scholars whose main interests lie outside that field. There will be three Awards 
per year, to holders of the doctorate or equivalent in the biological, the physical, 
and the social sciences. Each Award will be $4000 or slightly more, office space 
will be provided, and $600 to $1000 will be available for clerical, computational, 
and research assistance. There will be no tuition charges. 

The purpose of the Awards is to give statistical training to a few scientists 
who may be expected to employ it both to the direct advance of their specialties 
and to the enlightenment of their colleagues and students by example, by con- 
sultation, and by formal instruction. The development of the field of statistics 
has been so rapid that problems of communication are a serious obstacle to its 
full exploitation. The amount and quality of instruction available to young 
students is constantly increasing, but there is a real need, which these Awards 
seek to fill, for making appropriate instruction available to already established
scientific workers who give promise of immediate applications of statistics to
their special fields. 

Recipients of the Awards must have received the doctor’s degree prior to 
commencing the program, except in the case of recognized research workers 
whose experience and accomplishments are clearly the equivalent. Candidates 
whose mathematical preparation includes less than the usual sophomore year of 
calculus, or its equivalent, will not ordinarily be considered, but previous train- 
ing in statistics is not required or expected. Candidates having under way re- 
search programs in their own fields will be preferred, and the department of the 
University of Chicago concerned with a candidate’s specialty will be asked to 
participate in evaluating his application. Recipients must spend eleven months 
studying statistics at the University of Chicago, and will be expected to pursue 
a number of regular courses. 

Applications, or requests for further information, should be sent to: Committee 
on Statistics, University of Chicago, Chicago 37. Applications for the academic 
year 1952-53 should arrive by April 1, 1952. 


Proceedings of the Second Berkeley Symposium 


The Proceedings of the Second Berkeley Symposium on Mathematical Statistics 
and Probability, 1950, contains forty-six articles on mathematical statistics, 
probability, and their applications to astronomy, biometry, econometrics, physics, 
traffic engineering, and wave analysis, by the following authors: T. W. Anderson, 
K. J. Arrow, E. W. Barankin, D. M. Belmont, J. Berkson, D. S. Berry, D. 
Blackwell, G. W. Brown, K. L. Chung, W. G. Cochran, H. Cramér, B. de Finetti, 
J. L. Doob, A. Dvoretzky, P. Erdös, W. Feller, R. P. Feynman, T. W. Forbes,
R. Fortet, M. A. Girshick, T. E. Harris, L. G. Henyey, J. L. Hodges, Jr., W.
Hoeffding, P. G. Hoel, H. Hotelling, M. Kac, S. Kakutani, J. Kampé de Fériet,
H. W. Kuhn, R. S. Lehman, E. L. Lehmann, V. F. Lenzen, P. Lévy, H. W.
Lewis, B. Lindblad, M. Loève, J. Marschak, A. M. Mood, G. Placzek, H. Robbins,
P. Rudnick, L. J. Savage, E. L. Scott, H. R. Seiwell, O. Struve, R. J.
Trumpler, A. W. Tucker, A. Wald, W. A. Wallis, J. Wolfowitz, A. Zygmund. 


The price of this volume of 666 pages is $11.00. Orders for the Proceedings 
should be addressed to University of California Press, Berkeley 4, California. 




New Members 


The following persons have been elected to membership in the Institute 
(June 1, 1951 to August 22, 1951) 


Allen, Stephen G., Jr., M.A. (Univ. of Chicago), Research Associate, Applied Mathe- 
matics and Statistics Laboratory, Stanford University, Stanford, California. 

Amundsen, Mrs. Herdis T., Aktuarkandidat (Univ. of Oslo), Lecturer in Mathematical 
Statistics, University of Oslo, Oslo, Norway. 





Appel, Frederick W., Ph.D. (Univ. of Chicago), Executive Secretary, Public Health and 
Experimental Therapeutics Study Sections, Division of Research Grants, National 
Institutes of Health, Public Health Service, 3450 - 38th Street, N.W., Washington 16,
D.C. 

Berg, William D., Ph.D. (Univ. of Iowa), Assistant Professor, University of Ohio, Box
285, Gambier, Ohio.

Botzum, Rev. William A., Ph.D. (Univ. of Chicago), Instructor, Department of Education, 
University of Notre Dame, Notre Dame, Indiana. 

Davison, George H., Secretary, The United Steel Companies, Ltd., Research & Develop- 
ment Department, Swinden House, Moorgate, Rotherham, England. 

de Agrisqueta, Francisco, Ph.D. (Univ. de Deusto, Bilbao, Spain), Acting Secretary General
of Inter American Statistical Institute, 1801 Clydesdale Place, N.W., Apartment
209, Washington 9, D.C. 

de la Garza, Andres, B.S., Statistician, 105 Pacific, Oak Ridge, Tennessee. 

Foscue, Augustus W., Jr., M.B.A. (Stanford Univ.), Professor of Accounting and Statistics 
and Chairman of Statistics Department, School of Business Administration, Southern 
Methodist University, Dallas 5, Texas. 

Fritz, Edward, M.S. (Univ. of Michigan), Research Engineer, Franklin Institute, 4026
Girard Avenue, Philadelphia, Pennsylvania. 

Goulden, Cyril H., Ph.D. (Univ. of Minnesota), Chief, Cereal Division, Experimental 
Farm, Ottawa, Canada. 

Grant, J. Douglas, M.A. (Stanford Univ.), Clinical Psychologist, U. S. Naval Retraining
Command, Mare Island, Vallejo, California.

Green, Bert F., Jr., Ph.D. (Princeton), Staff Member, Research Laboratory of Electronics, 
Room 22A—233c, Massachusetts Institute of Technology, Cambridge 39, Massa- 
chusetts. 

Hadden, Stuart T., M.A. (Temple Univ.), Chemical Engineer and Senior Technologist, 
Research and Development Department, Socony-Vacuum Oil Co., Inc., Paulsboro, 
New Jersey, 416 S. Jackson Street, Woodbury, New Jersey. 

Hildreth, Clifford, Ph.D. (Iowa State College), Associate Professor, Cowles Commission 
for Research in Economics, University of Chicago, Chicago 37, Illinois. 

Howe, William G., B.A. (Univ. of Rochester), Statistician, Color Control Department, 
Kodak Park, Eastman Kodak Company, Rochester, New York. 

Jackson, Patricia L., Ph.D. (Columbia Univ.), Employment Manager, Alexander’s Depart- 
ment Stores, Inc., Grand Concourse & Fordham Road, Bronx, New York. 

Kaelin, Alois, Dipl. Math. (Eidgenossischen Technischen Hochschule, Zurich), Waldlistr. 4,
Zurich 7/32, Switzerland. 

Kibbey, Milton E., B.A. (Univ. of Michigan), Development Engineer and Member of 
Statistical Analysis Group, 351 W. Outer Drive, Oak Ridge, Tennessee. 

Kuffner, Peter K., B.S. (Univ. of Chicago), Student, Department of Mathematics, Univer- 
sity of Chicago, 4002 S. Brighton Place, Chicago 32, Illinois. 

Kurkjian, Badrig M., S.B. (Mass. Inst. of Tech.), Mathematician, Room 300, Building 92, 
National Bureau of Standards, 2117 Webster Street, N.E., Washington, D.C.

Lewsley, Bernard J., B.S. (Birmingham Univ., England), Research and Development 
Engineer, General Electric Company, Ltd., 72 Ansty Road, Coventry, England. 

Marans, Frances A., B.S. (Wilson Teachers College, Wash., D. C.), Mathematical Statis- 
tician, Office of the Statistical Consultant, Division of Manpower and Employment 
Statistics, Bureau of Labor Statistics, Department of Labor, 1336 Missouri Avenue, 
N.W., Apt. 218, Washington 11, D.C. 

Mitten, Loring G., M.S. (Mass. Inst. of Tech.), Instructor, Department of Industrial
Engineering, Ohio State University, Industrial Engineering Building, Columbus 10,
Ohio.


Moonan, William J., M.A. (Univ. of Minnesota), Instructor in Statistics Laboratory, College
of Education, University of Minnesota, 23 E. 54th Street, Minneapolis, Minnesota.





Moore, Cordell B., M.S. (Univ. of Kentucky), Instructor, Department of Mathematics, 
University of Kentucky, Lexington, Kentucky. 

Sachs, David, A.B. (Columbia Univ.), Student, Department of Mathematical Statistics,
Columbia University, 76 Central Park West, New York 23, New York. 

Smith, Cecil W., Analysis Officer, No. 3 Line, British Overseas Airways Corporation, 107 
Pembroke Road, Clifton, Bristol 8, England. 

Stalnaker, John M., M.A. (Univ. of Chicago), Director of Studies, Association of American 
Medical Colleges, 1075 Elm Street, Winnetka, Illinois. 

Swalm, R. O., B.S. (Univ. of Pa.), Assistant Professor of Industrial Engineering, Syracuse 
University, 9 Wyncrest Drive, East Syracuse, New York. 

Thomson, Kenneth F., Ph.D. (Ohio State Univ.), Assistant Project Director, Richardson, 
Bellows, Henry & Co., 4839 Madison Avenue, New York City, 26 Burbank Street, Yon- 
kers 2, New York. 

Titman, Richard H., B.S. (Univ. of Wash.), Statistician, General Electric Co., Nucleonics 
Division, 711 Stanton, Richland, Washington. 

Vinci, Felice, Doctor Juris. (Univ. of Palermo, Italy), Professor of Statistics and Director 
of Instituto di Scienze Economiche e Statistiche, University of Milan, Via Lamarmora 
42, Milano, Italy. 

White, Aubrey, B.A. (Univ. of Toronto), Consulting Actuary and member of Ostheimer & 
Co., 1500 Chestnut Street, Philadelphia, Pennsylvania. 

White, John S., M.A. (Univ. of Minnesota), Graduate Student, University of Minnesota, 
134 West 62nd Street, Minneapolis 19, Minnesota. 

Whitney, Alfred G., Ed.M. (Harvard), Research Associate, Life Insurance Agency Manage- 
ment Association, 855 Asylum Avenue, Hartford 5, Connecticut. 

Winer, Ben J., M.S. (Univ. of Oregon), Research Associate, Personnel Research Board, 
Ohio State University, Columbus, Ohio. 

Wittenborn, J. R., Ph.D. (Univ. of Illinois), Research Associate in Psychology, Yale Uni- 
versity, 38 Cedar Street, New Haven, Connecticut. 

Zobel, Sigmund P., M.B.A. (Univ. of Buffalo), Lecturer in Statistics, School of Business 
Administration, also Assistant in Preventive Medicine and Public Health, School of 
Medicine, University of Buffalo, 105 Landon Street, Buffalo 8, New York. 




REPORT OF THE MINNEAPOLIS MEETING OF THE INSTITUTE 


The thirteenth summer meeting and forty-eighth meeting of the Institute of 
Mathematical Statistics was held at the University of Minnesota, September 
4-7, 1951, in conjunction with the summer meeting of the Mathematical Asso- 
ciation of America, the summer meeting of the American Mathematical Society 
and the Minneapolis meeting of the Econometric Society. The meeting was 
attended by the following one hundred and six members of the Institute: 


G. E. Albert, R. L. Anderson, T. W. Anderson, K. J. Arnold, W. D. Baten, Helen P. 
Beard, Agnes Berger, Joseph Berkson, Jean Bronfenbrenner, Irwin Bross, Hobart Bushey, 
Maria Castellani, Herman Chernoff, A. G. Clark, T. F. Cope, A. H. Copeland, Sr., L. M. 
Court, E. L. Cox, J. F. Daly, G. B. Dantzig, W. J. Dixon, J. L. Doob, Aryeh Dvoretzky, 
P. S. Dwyer, Churchill Eisenhart, Lillian Elveback, H. P. Evans, C. H. Fischer, Evelyn Fix,
J. S. Frame, Robert Gage, H. M. Gehman, L. A. Goodman, John Gurland, P. C. Hammer,
T. E. Harris, H. L. Harter, Clifford Hildreth, J. L. Hodges, Jr., Wassily Hoeffding, J. F.
Hofmann, Robert Hogg, Harold Hotelling, H. M. Hughes, S. L. Isaacson, P. O. Johnson,
Walbert Kalinowski, Leo Katz, Harriet J. Kelly, O. Kempthorne, M. G. Kendall, Jack
Kiefer, W. M. Kincaid, T. C. Koopmans, C. F. Kossack, R. L. Kozelka, William Kruskal,
O. E. Lancaster, F. C. Leone, Howard Levene, S. B. Littauer, R. B. McHugh,
H. B. Mann, Margaret P. Martin, K. O. May, G. F. T. Mayer, Paul Meier, M. R. Mickey, 
Sigeiti Moriguti, Frederick Mosteller, Jerzy Neyman, M. L. Norden, I. Olkin, Arthur 
Ollivier, Toby Oxtoby, M. P. Peisakoff, G. B. Price, J. A. Rafferty, Howard Raiffa, F. D. 
Rigby, D. D. Rippe, Herman Rubin, Henry Scheffé, Elizabeth L. Scott, W. B. Simpson, 
Andrew Sobczyk, Milton Sobel, M. D. Springer, Charles Stein, Joseph Talacko, W. F. 
Taylor, Henry Teicher, D. Teichroew, D. J. Thompson, L. J. Tick, G. Tintner, A. E. 
Treloar, A. W. Tucker, J. W. Tukey, S. A. Tyler, E. C. Varnum, John von Neumann, D. 
F. Votaw, J. S. White, J. Wolfowitz, and M. A. Woodbury.


The meeting opened on Tuesday morning, September 4, 1951, with a Symposium
on Medical Statistics. Professor Alan Treloar of the University of Minnesota
was chairman. Dr. Joseph Berkson of the Mayo Clinic presented a paper
entitled Estimate of Effectiveness of Cancer Therapy from Mortality following Treatment.
Professor Jerzy Neyman of the University of California presented a paper
entitled Further Results Concerning the Follow-up Procedure. Prepared discussion 
of the papers was offered by Miss Lillian Elveback of the University of Minne- 
sota, Dr. Evelyn Fix of the University of California and Dr. W. F. Taylor of 
the School of Aviation Medicine. Approximately 55 persons attended the session. 

During the meeting the Institute joined the Econometric Society in three 
sessions devoted to a Symposium on the Theory of Games, Decision Problems and 
Related Topics. The first of these sessions with the title Theory of Games for the 
session was held on Tuesday afternoon under the chairmanship of Professor 
John von Neumann of the Institute for Advanced Study. Over 250 persons 
attended this session. Papers were presented by Professor Samuel Karlin of 
Princeton University and the Rand Corporation and by Dr. Olaf Helmer of the 
Rand Corporation. Prepared discussion was presented by Dr. Seymour Sherman 
of Lockheed Aircraft Corporation, Dr. L. S. Shapley of Princeton University
and the Rand Corporation, Professor Howard Raiffa of the University of Michi- 
gan, and Dr. G. B. Dantzig of the Department of the Air Force. 

On Wednesday morning the first of three sessions for contributed papers was 
held under the chairmanship of Professor W. J. Dixon of the University of 
Oregon. Attendance was approximately 50. The following papers were presented:


1. On Stieltjes Integral Equations of Stochastic Processes. Maria Castellani, University 
of Kansas City. 

2. An Unfavorable Aspect of the Likelihood Ratio Test. L. M. Court, Rutgers University. 

3. Impartial Decision Rules and Sufficient Statistics. R. R. Bahadur and L. A. Goodman, 
University of Chicago. 

4. Contributions to the Statistical Theory of Counter Data. G. E. Albert, University of 
Tennessee and Oak Ridge National Laboratory, and M. L. Nelson, Oak Ridge National 
Laboratory. 

5. On the Use of Wald’s Classification Statistic. H. L. Harter, Michigan State College. 


Later on Wednesday morning, Session II—Statistical Decision Problems—of 
the Symposium on the Theory of Games, Decision Problems and Related Topics 
was held with Professor A. W. Tucker of Princeton University as chairman. 
Approximately 175 persons attended. Papers were presented by Professor David
Blackwell of Howard University and Professor Charles Stein of the University
of Chicago. Prepared discussion was presented by Dr. M. P. Peisakoff of the 
Rand Corporation, Professor Aryeh Dvoretzky of Hebrew University, Jerusalem,
and Cornell University, Professor Herman Chernoff of the University of
Illinois, Professor J. L. Hodges of the University of California, and Professor 
Samuel Karlin of Princeton University and the Rand Corporation. 

On Thursday morning a Symposium on Probability and Statistical Inference 
was held jointly with the Econometric Society. Professor Leo Katz of Michigan 
State College was chairman. Approximately 140 persons attended. Professor 
Jerzy Neyman of the University of California presented a paper on Inductive 
Behavior and Professor J. W. Tukey of Princeton University presented a paper 
on Purposes of Fiducial Inference. Prepared discussion was presented by Pro- 
fessor Leonid Hurwicz of the University of Minnesota and Professor Gerhard 
Tintner of Iowa State College. 

The Third Rietz Memorial Lecture was given by Professor Harold Hotelling 
of the University of North Carolina at 2 p.m. Thursday, September 6, 1951 
under the title, The Behavior of Standard Statistical Tests under Nonstandard
Conditions. Professor Jerzy Neyman of the University of California was chair- 
man of the session. The attendance was approximately 120. 

Later on Thursday afternoon, Session III—Decision Making and Theory of
Organization—of the Symposium on the Theory of Games, Decision Problems and 
Related Topics was held with Professor T. C. Koopmans of the Cowles Commis- 
sion as chairman. Approximately 100 persons attended. A paper by Professor 
Leonid Hurwicz of the University of Minnesota was followed by prepared dis- 
cussion by Dr. Norman Dalkey of the Rand Corporation, Professor David Gale 
of Brown University, and Commander W. H. Keen of the Naval Air Develop- 
ment Center. A paper by Dr. M. M. Flood of the Rand Corporation was fol- 
lowed by prepared discussion by Professor C. B. Tompkins of George Washing- 
ton University and Professor H. A. Simon of Carnegie Institute of Technology. 

Early Friday morning the second session for contributed papers, attended by 
approximately 50 people, was held with Professor Henry Scheffé of Columbia 
University as chairman. The following papers were presented: 


6. Polynomial Determination in a Field of Integers Modulo P. E. C. Varnum, Barber-
Colman Company. 

7. About Some Symmetrical Distributions from the Perks’ Family of Functions. J. V. 
Talacko, Marquette University. 

8. A Large-Sample Test for the Variation of Sample Covariance Matrices. D. D. Rippe, 
University of Michigan. 

9. Probability Models for Analyzing Time Changes in Attitudes. T. W. Anderson, Colum- 
bia University. 

10. The Variance of a Weighted Average Using Estimated Weights. Paul Meier, Princeton 
University. 


An Abraham Wald Memorial Session was held at 10:30 a.m. Friday, Septem- 
ber 7, 1951. Professor Harold Hotelling of the University of North Carolina 
was chairman. Approximately 100 people attended the session. A paper on 
Wald’s Contributions in Pure Mathematics by Professor Karl Menger of Illinois 
Institute of Technology was read by Professor J. L. Kelley of Tulane Univer- 
sity. A paper on Wald’s Contributions in Econometrics was presented by Professor 
Gerhard Tintner of Iowa State College and a paper on Wald’s Contributions in 
Mathematical Statistics was presented by Professor J. Wolfowitz of Cornell 
University. 

Early Friday afternoon a session under the chairmanship of Professor Paul S. 
Dwyer of the University of Michigan was devoted to two invited addresses. 
Approximately 70 persons attended. Dr. T. E. Harris of the Rand Corporation 
spoke on First Passage and Recurrence Distributions and Professor M. G. Ken- 
dall of the London School of Economics spoke On the Systematic Determination 
of Sampling Distributions. 

Later on Friday afternoon the third session for contributed papers was held 
under the chairmanship of Professor K. J. Arnold of the University of Wisconsin. 
Approximately 40 persons attended. The following papers were presented: 


11. Distribution of Ratios of Quadratic Forms. John Gurland, University of Chicago. 

12. The Large-Sample Power of Tests Based on Permutations of Observations. Preliminary 
Report. Wassily Hoeffding, University of North Carolina. 

13. A Complete Class of Decision Procedures for Distributions with Monotone Likelihood 
Ratio. Herman Rubin, Stanford University. 


The following papers were presented by title at the meeting: 


14. Some Nonparametric Results for Experimental Designs. J. E. Walsh, Bureau of the 
Census. 


15. Efficient Tests and Confidence Intervals for Mortality Rates. J. E. Walsh, Bureau of 
the Census. 


16. Sufficient Statistics when the Carrier of the Distribution Depends on the Parameter. 
D. A. S. Fraser, University of Toronto. 

17. Bayes Solutions and Likelihood Ratio Tests of Some Simple and Composite Hypotheses. 
Preliminary Report. Allan Birnbaum, Columbia University. 

18. The Impossibility of Certain Affine Resolvable Balanced Incomplete Block Designs. 
S. S. Shrikhande, Nagpur College of Science, India. 

19. On Sufficiency and Statistical Decision Functions. R. R. Bahadur, University of 
Chicago and Delhi University, India. 

20. A Two Sample Test Procedure. D. B. Owen, University of Washington. 

21. A Combinatorial Central Limit Theorem. Wassily Hoeffding, University of North 
Carolina. 

22. Necessary Conditions for the Existence of a Symmetrical Group Divisible Design. R. 
C. Bose and W. S. Connor, Jr., University of North Carolina. 

23. On a Problem of Mapping of One Space on Another with Applications in Sampling 
Distributions. S. N. Roy, University of North Carolina. 

24. On a Theorem in Jacobians with Statistical Applications. S. N. Roy, University of 
North Carolina. 


25. The Inventory Problem. A. Dvoretzky, J. Kiefer and J. Wolfowitz, Cornell 
University. 


The Council of the Institute held a meeting at 1:30 p.m. on Wednesday, 
September 5, 1951. 







A business meeting of the Institute was held at 9 a.m. on Thursday, Septem- 
ber 6, 1951. 

Social events included a reception sponsored by the mathematical organiza- 
tions and the College of St. Thomas on Tuesday evening, a banquet on Wednes- 
day evening, a piano recital on Thursday evening and a beer party sponsored 
by the Institute later on Thursday evening. 


K. J. ARNOLD 
Associate Secretary 




PUBLICATIONS RECEIVED 


Anuario Estadístico de España, (Instituto Nacional de Estadística), Presidencia del Go- 
bierno, Madrid, 1950, xliii + 1048 pp. 

Anuario Estadístico de España (Edición Manual), (Instituto Nacional de Estadística), 
Presidencia del Gobierno, Madrid, 1951, lvii + 944 pp. 

WALKER, HELEN M., Mathematics Essential for Elementary Statistics, rev. ed., Henry Holt 
and Company, New York, 1951, viii + 382 pp., $2.75. 










ESTADISTICA 


Official Journal of the Inter American Statistical Institute 


Vol. IX, No. 32 September 1951 
Contents 


Estadísticas de las Finanzas Públicas: Las Funciones del Presupuesto de los Gobi- 
ernos Centrales 


Notas sôbre o Levantamento de Dados Bio-Estatísticos na Amazônia Brasi- 
leira ..... ACHILLES SCORZELLI JR. 


Renta Nacional ..... LORETO DOMINGUEZ 


El Problema de la Suficiencia de Pagos y de la Seguridad de Empleo para los Esta- 
dísticos en las Agencias Oficiales de la América Latina 


Quelques Observations sur L'Assimilation Linguistique des Immigrés au Brésil et de 
Leurs Descendants ..... GIORGIO MORTARA 


Informes sobre la I Sesión de la Comisión de Mejoramiento de las Estadísticas 
Nacionales, y sobre la IV Sesión de la Comisión del Censo de las Américas de 
1950, Washington, D. C., June 2-15, 1951. 


Editorial Notes. Institute Affairs. Statistical News. Publications. 


Editor: Francisco de Abrisqueta 


Inter American Statistical Institute, c/o Pan American Union, Washington 6, D.C., U. S. A. 





JOURNAL OF THE September 1951 


AMERICAN STATISTICAL ASSOCIATION Vol. 46 No. 255 
1108 16th Street, N. W., Washington 6, D. C. 


Statistics in Production and Inspection ..... EDWIN G. OLDS 
The Verification and Scoring of Weather Forecasts ..... IRVING I. GRINGORTEN 
Some Statistical Problems in Small Group Research ..... ROBERT F. BALES 
Relations between Prices, Consumption, and Production ..... KARL A. FOX 
The Distribution of the Range in Samples from a Discrete Rectangular Population 
..... PAUL R. RIDER 
Actuarial Science ..... CHARLES A. SPOERL 
Statistical Measurement and Economic Mobilization Planning ..... GLENN E. McLAUGHLIN 
National Income ..... GEORGE JASZI 
A Large-Sample Test of the Hypothesis that One of Two Random Variables Is Stochastically 
Larger than the Other ..... ANDREW W. MARSHALL 


REPRINTS OF ABSTRACTS IN STATISTICAL METHODOLOGY 
BOOK REVIEWS 


The American Statistical Association invites as members all persons 
interested in: 
1. development of new theory and method 
2. improvement of basic statistical data 
3. application of statistical methods to practical problems. 





BIOMETRIKA 
A Journal for the Statistical Study of Biological Problems 


Volume 38 Contents Parts 3 and 4, December 1951 


1. Biometrika 1901-1951. By W. P. ELDERTON. 2. Jacobians of certain matrix transformations useful in 
multivariate analysis. By W. L. DEEMER and I. OLKIN. 3. A chart for the incomplete Beta function 
and the cumulative binomial distribution. By H. O. HARTLEY and E. R. FITCH. 4. The effect of 
standardization on a χ² approximation in factor analysis. By M. S. BARTLETT. 5. Some systematic 
experimental designs. By D. R. COX. 6. On estimating the size of mobile populations from recapture 
data. By N. T. J. BAILEY. 7. The comparison of several groups of observations when the ratios of the 
population variances are unknown. By G. S. JAMES. 8. On the comparison of several mean values: an 
alternative approach. By B. L. WELCH. 9. Tables of symmetric functions: Pts. II and III. By F. N. 
DAVID and M. G. KENDALL. 10. A mathematical theory of animal trapping. By P. A. P. MORAN. 
11. Two applications of bivariate k-statistics. By B. M. COOK. 12. Expected frequencies in a sample of 
an animal population in which the abundances of species are lognormally distributed: Pt. I, Theory; Pt. II, 
Application. By P. M. GRUNDY. 13. The fitting of polynomials to equidistant data with missing values. 
By H. O. HARTLEY. 14. The delay to pedestrians crossing a road. By J. C. TANNER. 15. Interrela- 
tions between certain linear systematic statistics of samples from any continuous population. By G. P. 
SILLITTO. 16. Truncated log-normal distributions: I. Solution by moments. By H. R. THOMPSON. 
17. Further applications of range to the analysis of variance. By H. A. DAVID. 18. The estimation of 
population parameters from data obtained by means of the capture-recapture method: I. The maximum 
likelihood equation for estimating the death rate. By P. H. LESLIE and DENNIS CHITTY. 19. MIS- 
CELLANEA. 20. REVIEWS. 


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be drawn to Biometrika and sent to “The Secretary, Biometrika Office, Department of Statistics, 
University College, London, W.C. 1.” All foreign cheques must be in sterling and drawn on a bank 
having a London agency. 





ECONOMETRICA 


Journal of the Econometric Society 


Contents of Vol. 19, October, 1951, include: 


OSKAR MORGENSTERN ..... Abraham Wald, 1902-1950 
ABRAHAM WALD ..... On Some 
Systems of Equations of Mathematical Economics (translated by Otto Eckstein) 

LAWRENCE R. KLEIN 
Estimating Patterns of Savings Behavior from Sample Survey Data 

KENNETH J. ARROW 
Alternative Approaches to the Theory of Choice in Risk-Taking Situations 
TJALLING C. KOOPMANS ..... Efficient Allocation of Resources 
RENÉ ROY ..... La Demande des Biens Indirects 
List of Members of the Econometric Society. Geographical List of Members and 
Subscribers. Book Reviews and Notices of Meetings. 


Published Quarterly Subscription to Nonmembers: $9.00 per year 


The Econometric Society is an international society for the advancement of economic theory in its 
relation to statistics and mathematics. 


Subscriptions to Econometrica and inquiries about the work of the Society and the procedure in applying 
for membership should be addressed to William B. Simpson, Secretary, The Econometric Society, The 
University of Chicago, Chicago 37, Illinois, U. S. A. 








MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical liter- 
ature of the world, with full subject and author indices 


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 
Mathematical Statistics, London Mathematical Society, Edinburgh 
Mathematical Society, Union Matematica Argentina, and others. 


Subscriptions accepted to cover the calendar year only. 
Issues appear monthly except July. $20.00 per year. 
Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
80 Waterman Street, Providence 6, Rhode Island 





JOURNAL OF THE 
ROYAL STATISTICAL SOCIETY 


Series B (Methodological) 


Contents of Volume 13, No. 1, 1951 


G. E. P. BOX AND K. B. WILSON 
On the Experimental Attainment of Optimum Conditions. (With Discussion) 
G. A. BARNARD ..... The Theory of Information. (With Discussion) 
F. BENSON AND D. R. COX 
The Productivity of Machines Requiring Attention at Random Intervals 
L. FOX AND J. G. HAYES ..... More Practical Methods for the Inversion of Matrices 
S. RUSHTON 
On Least Square Fitting by Orthonormal Polynomials Using the Choleski Method 
BARNET WOOLF ..... Computation and Interpretation of Multiple Regressions 
M. P. SCHUTZENBERGER 
An Extension Problem in the Theory of Incomplete Block Designs 
H. R. THOMPSON AND I. D. DICK 
Factorial Designs in Small Blocks Derived from Orthogonal Latin Squares 
ALLADI RAMAKRISHNAN ..... Some Simple Stochastic Processes 
P. A. P. MORAN ..... Estimation Methods for Evolutive Processes 
P. A. P. MORAN ..... The Random Division of an Interval—Part II 


The Royal Statistical Society, 4, Portugal Street, London, W.C.2. 








SKANDINAVISK 
AKTUARIETIDSKRIFT 


1951 - Parts 1 - 2 
Contents 


HARALD BERGSTRÖM ..... On Asymptotic Expansions of Probability Functions 
EDWARD W. BARANKIN 
Concerning Some Inequalities in the Theory of Statistical Estimation 
MARTIN SANDELIUS ..... Truncated Inverse Binomial Sampling 
KNUT MEDIN ..... A Function for Smoothing Tables of the Duration of Sickness 
MARTIN WEIBULL 
The Regression Problem Involving Non-random Variates in the Case of 
Stratified Sample from Normal Parent Populations with Varying Regression 
Coefficients 
K.-G. HAGSTROEM ..... Erik Stridsberg† 


Annual subscription: 10 Swedish Crowns (Approx. $2.00). 
Inquiries and orders may be addressed to the Editor, 
SKÄRVIKSVÄGEN 7, DJURSHOLM (SWEDEN) 


SANKHYA 


The Indian Journal of Statistics 
Edited by P. C. Mahalanobis 


Vol. 11, Part 1, 1951 


In Memoriam: Abraham Wald 
On the Realization of Stochastic Processes by Probability Distributions in 
Function Spaces ..... HENRY B. MANN 
A Theorem in Least Squares ..... C. R. RAO 
On Type B₁ and Type B Regions ..... H. K. NANDI 
Some Notes on Ordered Samples from a Normal Population ..... K. C. S. PILLAI 
Some Exponential Forms for Topographic Correlation ..... BIRENDRANATH GHOSH 
On the Orthogonal Polynomials Associated with Student’s Distribution. 
A. S. KRISHNAMOORTHY 
A Multivariate Gamma-Type Distribution ..... V. K. RAMABHADRAN 
A Study on Differences in Physical Development by Socio-Economic Strata. 
RAMKRISHNA MUKHERJEE 
U.N. Commission on Statistical Sampling—Report. 


Annual subscription: 30 rupees 
Inquiries and orders may be addressed to the 
Editor, Sankhya, Presidency College, Calcutta, India. 













